Edited by IRWIN G. SARASON 


CONTEMPORARY 


PERSONALITY 




VAN NOSTRAND REINHOLD 


EAST-WEST PRESS 




CONTEMPORARY RESEARCH 
IN PERSONALITY 


THE UNIVERSITY SERIES IN PSYCHOLOGY 


Editor 


DAVID C. McCLELLAND 
Harvard University 


DAVID C. BEARDSLEE and MICHAEL WERTHEIMER (Editors)—Readings in Perception

SEYMOUR FISHER and SIDNEY E. CLEVELAND—Body Image and Personality 

JOHN W. ATKINSON (Editor)—Motives in Fantasy, Action and Society: A Method
of Assessment and Study 

DAVID C. McCLELLAND, ALFRED L. BALDWIN, URIE BRONFENBRENNER, and FRED
L. STRODTBECK—Talent and Society: New Perspectives in the Identification of 
Talent 

BERNARD M. Bass and IRWIN A. BERG (Editors)—Objective Approaches to Per- 
sonality Assessment 

IRWIN G. SARASON—Contemporary Research in Personality


CONTEMPORARY RESEARCH 
IN PERSONALITY 


Edited by 
IRWIN G. SARASON 


Associate Professor of Psychology 
University of Washington 


(AN EAST-WEST EDITION)


VAN NOSTRAND REINHOLD COMPANY 


NEW YORK - LONDON - TORONTO - MELBOURNE
AFFILIATED EAST-WEST PRESS PVT. LTD.
NEW DELHI


VAN NOSTRAND REINHOLD COMPANY 
450 West 33rd Street, New York 10001, N.Y. 
(Principal Office) 

Windsor House, 46 Victoria Street, London, S.W.1 
25 Hollinger Road, Toronto 16, Canada 
Copyright © 1962, By 
VAN NOSTRAND REINHOLD COMPANY 




No reproduction in any form of this book, in whole or in part
(except for brief quotation in critical articles or reviews),
may be made without written authorization from the publishers.


AFFILIATED EAST-WEST PRESS PVT. LTD.
East-West Student Edition 1970


Price in India Rs. 10.50 
Sales Territory : India, Burma, Ceylon, Pakistan, Afghanistan and Nepal 


Reprinted in India with the special permission of the original 
Publishers, Van Nostrand Reinhold Company, 
New York, U.S.A. and the copyright holders. 


This book has been published with the assistance of the Joint 
Indian-American Textbook Programme. 


Published by W. D. TenBroeck for AFFILIATED EAST-WEST 

PRESS PRIVATE LTD., C 57 Defence Colony New Delhi 3, 

India, and printed by S. M. Balsaver at USHA PRINTERS, 
National House, Tulloch Road, Bombay 1, India. 


To My 
MOTHER AND FATHER




Preface 


The understanding of human characteristics 
and behavior has long tantalized psychologists and 
students in related disciplines. Two ways of find- 
ing a path to this goal of understanding personality 
functioning are discernible. One is through the 
development of comprehensive theoretical frame- 
works in terms of which all or virtually all behavior
can be interpreted. The other method
aims at comprehensive data-gathering relevant to 
[illegible]
aspects of human behavior. 

A comprehensive theory has the advantage of 
providing an overall framework within which re- 
search can be planned and conducted. However, 
a limitation of the comprehensive or global theory 
is that, because of its very comprehensiveness, its 
relationship to specific situations and methods may 
be tenuous and equivocal. Because of this, many 
students of personality have been willing to defer 
theoretical comprehensiveness in order, at the pres- 
ent time, to firmly establish the techniques, data 
and relationships which may provide the basis 
for broader generalizations in the future. Many 
workers in many branches of psychology currently 
seem to prefer to proceed from observation to 
theory rather than from theory to observation. 

This latter strongly empirical approach char- 
acterizes contemporary research in personality. 
Many motivations can be offered to account for 
this shift in emphasis from global theorizing such 
as is found in Freud’s writings to empirically- 
oriented delimited theorizing. Perhaps the most 
potent ones have been (1) the determination of 
psychologists to gain acceptance of psychology as 
a legitimate area of science and (2) their conse- 
quent emphasis on an empirical basis for theoreti- 
cal concepts. Contemporary American psycholo- 
gists working within the field of personality have 
recognized and been influenced by the significant 
contributions of comprehensive theorists such as 
Freud but they, also, seem to have felt the need to 
break out on their own in a variety of directions in 


attempts to place the study of personality on as em- 
pirically rooted and testable a basis as is possible. 
This development need in no way imply a rejection 
of the role of theoretical constructions and specu- 
lations in evolving conceptions of personality, but 
rather indicates a recognition of the need to main- 
tain defensible and understandable relationships 
between generalization and theory, on the one 
hand, and research techniques and methodology
on the other. 

The present collection of readings is predicated 
on the assumption that the personality research 
found in current psychological journals is fre- 
quently of quite a different order of investigation 
than that which would be suggested by a system- 
atic review of “traditional” personality theorists. 
The aim of this book is to provide the student 
with samples of empirical and theoretical issues, 
problems, and methods characteristic of much cur- 
rent research in personality. This aim does not 
carry with it the assumption that, at the present 
time, there is a well-delineated body of verified, 
well-established relationships within the field of 
personality. Were this the case one could, with 
good conscience, refer to a science of personality. 
This would be a serious exaggeration. It would 
not be a distortion, however, to envisage the field 
of personality as a science in the making and, 
indeed, the intent of this anthology is to document 
some of the beginnings which have been made in 
this direction. 

How such documentation is presented clearly 
must be influenced by one’s view of the field.
Exhaustive coverage of all branches of personality 
research has not been attempted in the present 
collection of papers. What is exhaustive in terms 
of one view might not be exhaustive in terms of
another. While not offered as a formal definition, 
the editor sees the field of personality as being 
primarily concerned with the isolation of variables 
relevant to the understanding, prediction and 
manipulation of individual differences and human 


variability. Emphasis on individual differences as 
a defining property of personality, it should be un- 
derstood, does not require any assumptions con- 
cerning the constancy of personality characteristics 
for an individual. Not only may these character- 
istics change and vary for the individual, but, de- 
pending on environmental or situational factors, 
their influences on behavior may vary. 

Since the nature and number of the variables 
relevant to the understanding of human differences 
are (1) unknown at present and (2) in all proba- 
bility, quite large and complex, it is not surprising 
that personality investigators have sought freely to 
relate concepts and methods of many branches of 
psychology to the problem of human variability. 
For this reason, the problems, methods, and con- 
cepts reflected in a collection such as the present 
one must necessarily be diverse. The inclusion of 
an interesting and varied sample of activities and 
ideas within the field of personality has been a 
major aim of the editor. 

Three of this collection’s nine sections are de- 
voted to the progress and problems characteristic 
of a widely used class of techniques for determin- 
ing individual differences by means of personality 
tests. Section I illustrates some of the ways in 
which paper-and-pencil questionnaires and psy- 
chometric devices have been employed to tap per- 
sonality attributes. Section II presents examples 
of research involving measures derived from subjects’
fantasy production. Section III includes
papers which are concerned with several general 
aspects of an assessment approach to personality. 
A goal of these first three sections is to suggest 
both the potentialities and the problems of method- 
ology inherent in the measurement of personality 
characteristics. 

Sections IV and V are concerned with the de- 
veloping individual and the influences of social and 
environmental factors on his behavior. The ma- 
terial in Section IV reflects, particularly, interest in 
observation of the child in both field and experi- 
mental settings and interest, also, in the problem 
of charting developmental changes through adult- 
hood. Development and behavior in the context 
of manipulation of social and cultural variables 
emerge as the major concern of the Section V 
articles. 

A distinction frequently made or implied is that 
between the study of personality and individual
differences on the one hand and experimental psy- 
chology on the other. While it is hoped that the 
contents of this book will be taken as evidence of 
the artificiality of this distinction, perhaps the ma- 
terial in Sections VI and VII presents the most 
telling evidence. These sections deal with prob- 
lems in the areas of perception and the self-concept 
in the case of Section VI and learning and factors 
affecting performance in Section VII. Certainly, 
the studies of perception and learning are among 
the most traditional in experimental psychology. 
It is interesting that in recent years the close interrelationship
of these areas with personality has
been demonstrated frequently.

The final sections of the book deal with deviant 
behavior, its analysis and treatment by means of 
psychological techniques. There are probably two 
major interests in studying deviant behavior for 
the student of personality. One is the intrinsic 
interest value of these deviations, and the other is 
the possible implications of explanations of abnor- 
mal behavior for understanding normal develop- 
ment. Section VIII includes articles dealing with 
(1) the study of deviant behavior by means of 
objective techniques and (2) the problems posed 
by attempts at their modification or elimination. 
Section IX is devoted to a valuable method fre- 
quently employed in attempts at chronicling and 
understanding behavior deviation. This method 
is that of the clinical case study. By its very
nature the case study is typically more concerned 
with focusing intensively on the behavior of one 
or a small number of individuals than with achiev- 
ing representative sampling and experimental 
methodology. Among the major advantages of 
systematic case studies are the leads and hypoth- 
eses which they provide for future researches in 
which sampling problems can be reduced and ex- 
perimental control of relevant variables can be 
exerted. 

Although some of the articles in this collection 
attempt to review comprehensively broad research 
areas, most of them do not have this as an objec- 
tive. For the reader interested in additional gen- 
eral references, there will be found at the end of 
each of the collection’s nine sections a short list of 
books, most of them quite recent, which deal with 
the section’s subject matter. 

SECTION I


Paper and Pencil Measures of Personality


The value of having available for diagnostic and 
selection purposes, questionnaires and paper and 
pencil tests which (1) are easily administered 
and scored and (2) can be objectively interpreted 
has impressed psychologists for many years. 
Along with the development of many different 
kinds of tests purporting to possess these char- 
acteristics there have arisen also questions con- 
cerning the nature of these tests and the factors 
influencing scores on them. In addition, there 
has been increasing interest in the relationship of 
personality tests to performance under various 
experimental conditions and to psychological 
theories. 

The first article in this section, McKinley and 
Hathaway’s, illustrates one facet of the develop- 
ment of the personality questionnaire most widely 
used by psychologists, the Minnesota Multiphasic 
Personality Inventory (MMPI). The approach 
reflected in the McKinley and Hathaway paper 
is, simply stated, that personality instruments 
should not only purport to be of diagnostic value 
but they should also empirically differentiate 
among individuals tested. 

One objection which some students of person- 
ality have raised concerning the use of true-false 
tests like the MMPI is that they are lacking in 
subtlety, largely because of the requirement that 
items be answered only as being either true or 
false for the individual. Meehl's article, in this 
section, performed the valuable task of showing
that the assertions that questionnaires are too obvious
and too simplified may themselves be viewed as 
oversimplifications. The sort of argument pre- 
sented by Meehl has led many psychologists to 
question the logical basis for the division of per- 
sonality instruments into objective and projective 
tests. 

Another problem frequently raised in connec- 
tion with personality questionnaires is that of the 
determinants of a given subject's test score. For 
example, does a high score on a measure of de- 
pression reflect simply depressive tendencies? 
Edwards has shown that one determinant which 
should be considered in evaluating personality 
test scores is the social desirability of the test 
items administered to the subject. His pioneering
contribution on this topic is included in this
section. On the basis of the work of Edwards, 
Cronbach, and others, much interest has been 
expressed in the general problem of the nature 
of personality test items and the factors influ- 
encing subjects’ answers to them. Jackson and 
Messick, in their paper, present a stimulating anal- 
ysis of the role of the set of the subject in affect- 
ing his test-taking behavior. 

Once individuals’ scores on personality tests 
have been obtained, can one use these scores to 
predict behavior in various situations? One way 
of attacking this question is to develop new tests 
or to use existing ones which can be related to 
theoretical formulations, and then to perform 
theoretically relevant experiments using test scores 
as independent variables. The Sarason review 
article reflects the degree of interest in this ap- 
proach for one category of personality measures, 
anxiety scales. While this article presents a broad 
survey of one area of personality research, anx- 
iety, the paper by Berkowitz illustrates some of 
the methodological facets of an experimental ap- 
proach to personality as reflected in research on 
hostility. 


A MULTIPHASIC PERSONALITY SCHEDULE (MINNESOTA):
IV. PSYCHASTHENIA *

J. C. McKINLEY AND S. R. HATHAWAY ¹


An earlier paper of this series (1) described 
the Multiphasic Personality Schedule and its pro- 
jected lines of development. The Multiphasic 
Schedule differs from traditional personality in- 
struments in the deliberate attempt to include 


* Reprinted by permission from The Journal of 
Applied Psychology, October, 1942, Vol. 26, No. 5, 
614-624. 

¹ Prepared on Work Projects Administration Official
Project No. 165-1-71-124, Sub-Project No. 379. Sup- 
ported in part by a research grant from the Graduate 
School of the University of Minnesota. 


among the items as many as possible of those 
that might give information of importance clin- 
ically, without regard to the particular phase of 
personality upon which the item may bear. This 
initial concept explains the name given to the 
inventory. The experimental schedule includes 
504 items which are printed upon separate cards. 
The subject responds to each card by filing it 
behind one of three index guide cards on which 
are printed “True,” “False,” or “Cannot Say.” 

The experimental items have now been admin- 
istered to about 3,000 individuals of various nor- 
mal and abnormal classifications. The chief nor- 
mal group against which all hospitalized abnormal 
groups are considered is comprised of adults to 
whom the items were administered when they 
came as visitors or brought patients to the Uni- 
versity Hospital. The only requirement for in- 
clusion of these persons as normal was that they 
said they were not under a doctor's care at the 
time of testing. The word “normal” as used 
herein never implies more than this. The normal 
group so obtained represents a reasonably ac- 
curate cross section of the Minnesota population 
with some over-emphasis of the rural population. 
The modal scholastic achievement is eighth grade 
and occupational ratings indicate that the modal 
occupational level is approximately that of the 
general adult population. 

Two scales have now been derived and pub- 
lished. The first of these was a scale for the 
measurement of hypochondriasis and the second 
one for the measurement of symptomatic depression.
The present paper deals with the measurement
of psychasthenia.²


A. DERIVATION OF THE PSYCHASTHENIA SCALE 


The psychiatric classification of psychasthenia 
is applied to a group of individuals whose thinking 
is characterized by excessive doubt, by compul- 
sions, obsessions, and unreasonable fears; these 
persons are often seen in psychiatric hospitals 
but are encountered much more frequently among 
normal groups by counselors and personnel work- 
ers. Certain phobias such as the fear of spiders, 
of snakes or of windstorms are widespread among 
the population, but similar phobias become so 
strong and so numerous in some persons as to 


² The Minnesota Multiphasic Personality Schedule is now available
for purchase from The University Press, Minneapolis, Minnesota.

afford a source of considerable maladjustment
vocationally, socially or otherwise. Often a psy- 
chasthenic individual is characterized not so much 
by well-marked fears of individual things or acts 
as by great doubts as to the meaning of his re- 
actions in what seems to be a hostile environment. 
In other cases the phobia becomes attached to 
certain acts or thoughts of the subject in such a 
way that he is forced through fear to compul- 
sively perform needless, disturbing or personally 
destructive acts or to dwell obsessively upon lines 
of thought which have no significance for his 
normal activities. 

Compulsive acts are always characterized by
the need felt by the subject to perform them with- 
out regard to rational considerations. For exam- 
ple, he may always be forced to count objects or 
to touch a certain spot on a wall or to avoid step- 
ping on sidewalk cracks. If he fails to do these 
things he feels uncomfortable; if he does them 
he is forced to rationalize and justify his acts. 
Obsessive thinking is itself commonly accom- 
panied by anxiety so that the patient may be tense 
and anxious over the content of his thoughts as 
when he thinks over and over again that he is 
useless. Similarly, he may find himself anxiously 
obsessed with such ideas as the impending likeli- 
hood that he will faint or that something terrible 
or threatening is about to happen. Again, he 
may be forced to think things which, while not in 
themselves producing anxiety, through his im- 
patience and preoccupation with the fact that he 
cannot stop thinking them, do secondarily pro- 
duce an anxious reaction; for example, compul- 
sive counting itself has little attached anxiety 
since the patient is merely forced to count every- 
thing that he sees, but he may worry so much 
over his inability to stop counting as to have 
anxiety as a large component in his thinking. The 
general reaction type characterized by these com- 
pulsive and obsessive acts and thoughts is called 
psychasthenia. The word derives from the con- 
cept of a weakened will that cannot resist the 
behavior regardless of its maladaptive character. 

The development of a scale for the measure- 
ment of the general symptomatic traits which 
are classed under the psychiatric designation of 
psychasthenia, has demonstrated that there is an 
identifiable personality pattern underlying the 
varying symptomatic picture from case to case. 


Many of the items making up the psychasthenia 
scale are clearly much more general than the 
specific compulsions or phobias and apply to a 
more general personality make-up of which the 
subject is usually entirely unaware. 

The methods of derivation for the present scale 
differ only in detail from the methods used in 
the scales reported earlier (2,3). Unfortunately 
for the present study not many entirely satisfac- 
tory criterion cases of psychasthenia come into 
the closed wards of a psychiatric clinic. Many
more are seen in the outpatient clinic or are ad- 
vised by lay counselors and are never severely 
handicapped. Because we have felt unsure in 
the use of even carefully studied inpatients for 
purposes of scale derivation, we have avoided 
using criterion cases from the outpatient clinic. 
The criterion group is thus small and not entirely 
homogeneous. At least one of the cases appears 
to have been incorrectly diagnosed. Fortunately, 
the trait itself is the most homogeneous one so 
far described so that correlations of items with 
total score could be used as a guide. Otherwise, 
we would hesitate to publish the results with so 
few criterion cases. 

The chief subjects for scale derivation con- 
sisted of (a) 139 normal married males between 
the ages of 26 and 43, and 200 normal married 
females between the ages of 26 and 43, (b) a 
group of 265 college students as a check of effect 
of age on item frequency, (c) a group of 20 
psychiatric patients carefully selected as probable 
psychasthenia cases. 

The criterion group included patients who had 
been intensively studied medically and psychia- 
trically and in whom the final diagnosis was psy- 
chasthenia in one or another form. Unfortu- 
nately, as mentioned above, it was necessary not 
only to use a rather small criterion group, but 
also to include in the group several persons who, 
as it subsequently developed, were probably not 
appropriate. For example, the two of this group 
who received the lowest final scores were young 
persons, one of 16 and the other of 17 years. 
One of these was not at all similar in item re- 
sponses to the remainder of the criterion group. 
It is probable that this 16-year-old boy was 
wrongly diagnosed. These two young patients 
are the two cases testing lowest in the criterion 
group. (See Fig. 1.) 


Fig. 1. Composite polygon of 690 normal cases constructed with the
abscissa scaled in standard scores. Above the polygon are plotted the
20 criterion cases and 50 cases that were noted as having some
symptomatic evidence of psychasthenia.


A preliminary step in deriving this scale was 
the tabulation of the item responses of the cri- 
terion group in contrast to the norm groups of 
men and women and the college students con- 
sidered as a mixed normal group selected for age 
and for scholastic aptitude. All items that showed 
a differentiation of two or more times the stand- 
ard error of the difference between the criterion 
and all of the normal groups were chosen as a 
preliminary scale for psychasthenia. All avail- 
able normal and psychiatric cases were then 
scored on this preliminary scale. It was possible 
at this point to check whether or not the scale 
seemed to be working in the right direction and 
to determine its apparent variability. Since the 
scale as derived in this preliminary fashion ap- 
peared to be unusually homogeneous and since 
there were other potentially useful items that 
had been doubtful in the statistics of the com- 
parison of criterion and normal groups, item cor- 
relations were used to test all preliminary scale 
items as well as certain of the doubtful items not 
included in the preliminary scale. 
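
The preliminary selection rule described above can be sketched in code. The paper does not state which formula was used for the standard error of the difference, so the unpooled standard error of a difference between two proportions is assumed here. The function names and every response frequency in the usage lines are hypothetical and are not taken from the study.

```python
import math


def critical_ratio(p_criterion, n_criterion, p_normal, n_normal):
    """Difference in response proportions divided by its standard error.

    Assumption: unpooled standard error of a difference between two
    independent proportions; the source does not give its formula.
    """
    diff = p_criterion - p_normal
    se = math.sqrt(
        p_criterion * (1.0 - p_criterion) / n_criterion
        + p_normal * (1.0 - p_normal) / n_normal
    )
    if se == 0.0:
        return 0.0 if diff == 0.0 else math.copysign(math.inf, diff)
    return diff / se


def keeps_item(p_criterion, n_criterion, normal_groups, threshold=2.0):
    """True if the item differs from every normal group by at least
    `threshold` standard errors, always in the same direction."""
    ratios = [
        critical_ratio(p_criterion, n_criterion, p, n)
        for p, n in normal_groups
    ]
    return all(r >= threshold for r in ratios) or all(
        r <= -threshold for r in ratios
    )


# Hypothetical proportions answering "True" in three normal groups,
# paired with the group sizes reported in the text.
normal_groups = [(0.20, 139), (0.25, 200), (0.15, 265)]

strong_item = keeps_item(0.70, 20, normal_groups)
weak_item = keeps_item(0.30, 20, normal_groups)
print(strong_item, weak_item)
```

With only 20 criterion cases the standard error is dominated by the criterion group, which is consistent with the authors' caution about publishing on so few cases.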

Tetrachoric correlations were obtained for 
every preliminary item and all the doubtful items 
against total scores on the preliminary scale for 
a sample of 100 normal persons and for a sample 
of 100 randomly selected psychiatric patients.
These data combined with the original comparison 


data of criterion and normal cases permitted us 
to select a final scale of 48 items. The following 
list contains all of these final scale items each 
followed by a “T” or an “F” to indicate the di- 
rection of the scored answer. After each item is 
given the tetrachoric correlation of the item with 
total score on the preliminary scale. The first 
figure is the correlation from normal cases and 
the second that from psychiatric cases. It was 
assumed that items were valid if they correlated 
with either group. In some cases only one cor- 
relation is given since the cell frequencies for a 
response might be too low to obtain a valid in- 
dication of the other correlation. A few of the 
items with low correlations were retained because 
the item had appeared very strong in the criterion 
group. The items are merely counted +1 when 
answered in the indicated direction. 
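
The counting rule can be sketched as follows. Only three of the 48 keyed items are reproduced, with the scored directions from the item list; the function name and the sample responses are hypothetical. A "Cannot Say" response never matches a keyed direction, so in this sketch it adds nothing to the score.

```python
# Three items from the final scale with their scored directions.
KEY = {
    "I seldom worry about my health": "F",
    "Most of the time I feel blue": "T",
    "I certainly feel useless at times": "T",
}


def raw_score(responses, key=KEY):
    """Count +1 for each keyed item answered in the scored direction.

    `responses` maps item text to "T", "F", or "?" (Cannot Say).
    Items missing from `responses` are treated as unscored.
    """
    return sum(
        1 for item, direction in key.items()
        if responses.get(item) == direction
    )


# Hypothetical answer sheet.
responses = {
    "I seldom worry about my health": "F",     # scored direction
    "Most of the time I feel blue": "F",       # not scored
    "I certainly feel useless at times": "?",  # Cannot Say, not scored
}
print(raw_score(responses))
```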

The distributions of Figure 1 show graphically 
the scores of the normal, the criterion cases and 
fifty symptomatic cases on the final scale. The 
scores used in Figure 1 are standard score equivalents
derived from the statistics for 293 normal
males and 397 normal females between the ages 
of 16 and 45. Two standard score tables were 
used, one for the males and one for the females. 
This cancels the sex differences. (See Table 1.) 
The standard scores were fitted to a mean of 50 
and a standard deviation of 10.
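
A minimal sketch of the conversion to standard scores, assuming the usual linear transformation to a mean of 50 and a standard deviation of 10. The paper does not print the means and standard deviations of the combined 16 to 45 norm groups, so the norm values below are hypothetical placeholders, as is the function name.

```python
# Hypothetical (mean, standard deviation) of raw scores for each sex.
# The actual norm values behind the authors' tables are not printed.
NORMS = {
    "male": (10.0, 7.0),
    "female": (13.0, 7.5),
}


def standard_score(raw, sex, norms=NORMS):
    """Linear standard score with mean 50 and standard deviation 10,
    computed against the norm group for the subject's own sex."""
    mean, sd = norms[sex]
    return 50.0 + 10.0 * (raw - mean) / sd


# The same raw score maps to different standard scores for the two
# sexes, which is how separate tables cancel the sex difference.
print(standard_score(17, "male"))
print(standard_score(17, "female"))
```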


Item                                                                    Answer  r (100 normals, 100 psychiatric)

I seldom worry about my health                                          F       40, 65
At times I have fits of laughing and crying that I cannot control       T       59, 47
I seem to be about as capable and smart as most others around me        F       80, 47
My memory seems to be all right                                         F       71
I feel weak all over much of the time                                   T       75, 61
I cannot understand what I read as well as I used to                    T       43, 63
There seems to be a lump in my throat much of the time                  T       80, 50
I wake up fresh and rested most mornings                                F       65, 80
Most nights I go to sleep without thoughts or ideas bothering me        F       32, 17
I almost never dream                                                    F       40, 53
I like to study and read about things that I am working at              F       51
I do many things which I regret afterwards (I regret things more
  or more often than others seem to)                                    T       65, 72
In school I found it very hard to talk before the class                 T       57, 44
I am easily embarrassed                                                 F       56, 70
I am more sensitive than most other people                              T       77, 53
I easily become impatient with people                                   T       52
Even when I am with people I feel lonely much of the time               T       55, 81
I wish I could be as happy as others seem to be                         T       60, 67
My daily life is full of things that keep me interested                 F       41, 74
I have had periods of days, weeks, or months when I couldn’t take
  care of things because I couldn’t “get going”                         T       66, 66
I frequently find myself worrying about something                       T       85, 76
Most of the time I feel blue                                            T       80, 82
Much of the time I feel as if I have done something wrong or evil       T       76
I feel anxiety about something or someone almost all the time           T       52
Once a week or oftener I become very excited                            T       90, 72
I have periods of such great restlessness that I cannot sit long in a
  chair                                                                 T       79, 75
Sometimes I become so excited that I find it hard to get to sleep       T       63, 50
I forget right away what people say to me                               T       16, 74
I usually have to stop and think before I act even in trifling matters  T       52
I have a habit of counting things that are not important such as
  bulbs on electric signs, and so forth                                 T       45, 67
Sometimes some unimportant thought will run through my mind and
  bother me for days                                                    T       8, 62
Bad words, often terrible words, come into my mind and I cannot get
  rid of them                                                           T       48, 71
Often I cross the street in order not to meet someone I see             T       80, 80
I have strange and peculiar thoughts                                    T       63, 19
I get anxious and upset when I have to make a short trip away
  from home                                                             T       58, 55
Almost every day something happens to frighten me                       T       82
I have been afraid of things or people that I knew could not hurt me    T       41, 20
I have no dread of going into a room by myself where other people
  have already gathered and are talking                                 F       35, 44
I am afraid of losing my mind                                           T       60
My hardest battles are with myself                                      T       45, 50
I have more trouble concentrating than others seem to have              T       90, 72
I have several times given up doing a thing because I thought too
  little of my ability                                                  T       53
I find it hard to keep my mind on a task or job                         T       36, 79
I am inclined to take things hard                                       T       56, 53
Life is a strain for me much of the time                                T       50, 84
I certainly feel useless at times                                       T       70, 74
I am certainly lacking in self-confidence                               T       52
Once in a while I think of things too bad to talk about                 T       46, 70


B. VALIDITY AND INCIDENTAL FINDINGS


There was relatively little change in score with
age. Table 1 shows the raw score statistics for
ten-year intervals from 16 to 65. The college
student group deviates markedly from the group
of similar age chosen at random from the population.
There is some difference between the sexes
as observed with other scales but without further
study no special significance should be attached
to this difference.


TABLE 1

                           Males                    Females
                      N     Mean   S.D.        N     Mean   S.D.

College students     155    7.27   5.8        115    8.93   6.0
Age 16-25            110    9.10   7.4        118   13.00   7.9
    26-35            110   10.54   7.2        165   12.19   7.7
    36-45             73   10.12   6.5        114   14.47   7.4
    46-55             40   10.90   7.8         59   12.31   6.7
    56-65             13   12.38   8.8         19   14.16   7.4

It is, unfortunately, not possible to estimate the 
validity of the psychasthenia scale by testing it 
on a new group of psychiatric patients diagnosed 
psychasthenia only but not used in the deriva- 
tion of the scale. A few new cases have been 
diagnosed by the clinic but another year will be 
required for the accumulation of a sufficiently 
large group to permit this type of statistical vali- 


dation. Nevertheless additional individuals so 
far obtained by clinical diagnosis have been devi- 
ates on the scale. 

The evidence of validity as given by the psy- 
chiatric cases with clinical symptoms of some 
degree of psychasthenia is relatively clear and 
positive. This has been shown by experience in 
the clinic but is more graphically shown in
Figure 1. The distribution marked symptomatic 
cases represents 50 psychiatric cases very hetero- 
geneous in diagnosis but with the one common 
characteristic that they were marked by the staff 
as having some symptomatic evidence of obses- 
sions or compulsions. Since none of these cases 
was finally diagnosed psychasthenia and since 
the clinician frequently overemphasizes symp- 
toms as seen in a person otherwise abnormal, 
the cases should not be expected to be uniformly 
high in the scale. Nevertheless the trend toward 
high scores for the group is clearly significant. 
Only ten per cent fall below the mean for the 
normal group. 

Table 2 lists the means and standard deviations 
of several groups in comparison to the normal 
group. In contrast to previously derived scales, 
the physically ill individuals from other portions 
of the hospital test very little above normals not 
in the hospital. Psychiatric cases without re- 
corded evidence of psychasthenia test above the 
normal average but the staff frequently fails to 
record the presence of some psychasthenic traits 
even though they are observed since the trends 
are not disabling. 


TABLE 2

Group                        N

Normals                    690
Criterion psychiatric       20
Symptomatic psychiatric     50
Other psychiatric          576
Physically ill             266
College students           270

Several measures of reliability are available.
For a group of 47 normal cases retested at in- 
tervals of never less than three days and up to 
more than a year, the test-retest reliability co- 
efficient is .74 + .15. Most of these cases were 
employees and staff but none knew that the test 
was to be repeated. The standard deviation of 
this group was 4.9 on the first test as compared to 
over 7 on general normals and undoubtedly the 
coefficient obtained represents a low limit rather 
than a true test-retest correlation value. The 
split half coefficient obtained from a group of 
200 random normal cases is .84 + .07. When a 
similar sample of 100 psychiatric cases selected 
at random is used, the correlation is .89 + .10. 
When these two correlations are statistically cor- 
rected to a full length test, they are .91 + .07 and 
94 + .10. 
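
The correction to a full length test can be checked with a short calculation. The sketch below assumes the correction used was the Spearman-Brown formula, which the text does not name; under that assumption the two split-half coefficients reproduce the corrected values quoted above. The function name and the `length_factor` parameter are illustrative only.

```python
def spearman_brown(r_half: float, length_factor: float = 2.0) -> float:
    """Estimate the reliability of a test lengthened by `length_factor`.

    With length_factor = 2 this is the usual step-up from a split-half
    coefficient to the full-length test.
    """
    return length_factor * r_half / (1.0 + (length_factor - 1.0) * r_half)


# Split-half coefficients quoted in the text.
for group, r_half in [("normal cases", 0.84), ("psychiatric cases", 0.89)]:
    print(group, round(spearman_brown(r_half), 2))
```

Rounded to two places, the results agree with the .91 and .94 given in the text.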

The test intercorrelation with hypochondriasis
as measured by H—Cy is .06 ± .10 as obtained
from 100 normals. The intercorrelation with
depression as measured by D (symbols refer to
the two previously published scales) on the same
group was .44 ± .10. When 100 miscellaneous
psychiatric cases are used, these two correlations
are, with H—Cy, .28 ± .10 and with D, .69 ± .10.
The rise in the correlation with depression with 
psychiatric cases is probably to be expected since 
the complaint factors involved in psychasthenia 
are dynamically related to depression so that 
many persons tend to have the psychasthenic 
type of fears in greater degree as their morale 
becomes lower, and conversely to be reactively 
more depressed as they are troubled by psy- 
chasthenia. 


C. SUMMARY 


The psychiatric designation psychasthenia as 
used in the present study, refers to a group of 
individuals who are frequently troubled by com- 
pulsions, obsessions, and phobias and who are 
often disabled by vacillation, excessive worry and 
lack of confidence. Through the differential 
study of persons having psychiatric evidences of 
psychasthenia, a scale was derived which is in- 
ternally homogeneous and which differentiates 
clinic patients from normals in a large percentage 
of cases. Further evidence of validity is given 
by the fact that on the average persons exhibit- 
ing psychasthenic symptoms to only a minor de- 
gree score significantly higher than normals. 


THE DYNAMICS OF “STRUCTURED”
PERSONALITY TESTS * 


PAUL E. MEEHL 


In a recent article in this Journal (9), Lt. 
Max L. Hutt of the Adjutant General’s School 
has given an interesting discussion of the use 
of projective methods in the army medical in- 
stallations. This article was part of a series de- 
scribing the work of clinical psychologists in the 
military services, with which the present writer 
is familiar only indirectly. The utility of any 
instrument in the military situation can, of course, 
be most competently assessed by those in contact 
with clinical material in that situation, and the 
present paper is in no sense to be construed as an 
“answer” to or an attempted refutation of Hutt’s 
remarks. Nevertheless, there are some incidental 
observations contained in his article which war- 
rant further critical consideration, particularly 
those having to do with the theory and dynamics 
of “structured” personality tests. It is with these 
latter observations rather than the main burden 
of Hutt’s article that this paper is concerned. 

Hutt defines “structured personality tests” as
those in which the test material consists of con-
ventional, culturally crystallized questions to
which the subject must respond in one of a very
few fixed ways. With this definition we have
no quarrel, and it has the advantage of not apply-
ing the unfortunate phrase “self-rating question-
naire” to the whole class of question-answer de-
vices. But immediately following this definition,
Hutt goes on to say that “it is assumed that each
of the test questions will have the same meaning
to all subjects who take the examination. The
subject has no opportunity of organizing in his
own unique manner his response to the questions.”

* Reprinted by permission from the Journal of
Clinical Psychology, October, 1945, Vol. 1, No. 4,
296-303.

These statements will bear further examination. 
The statement that personality tests assume that 
each question has the same meaning to all sub- 
jects is continuously appearing in most sources of 
late, and such an impression is conveyed by many 
discussions even when they do not explicitly make 
this assertion. It should be emphasized very 
strongly, therefore, that while this perhaps has 
been the case with the majority of question-answer 
personality tests, it is not by any means part of 
their essential nature. The traditional approach 
to verbal question-answer personality tests has 
been, to be sure, to view them as self-ratings; and 
it is in a sense always a self-rating that you obtain 
when you ask a subject about himself, whether 
you inquire about his feelings, his health, his 
attitudes, or his relations to others. 

However, once a “self-rating” has been ob- 
tained, it can be looked upon in two rather differ- 
ent ways. The first, and by far the commonest 
approach, is to accept a self-rating as a second 
best source of information when the direct ob- 
servation of a segment of behavior is inaccessible 
for practical or other reasons. This view in 
effect forces a self-rating or self-description to 
act as surrogate for a behavior-sample. Thus we 
want to know whether a man is shy, and one 
criterion is his readiness to blush. We cannot 
conveniently drop him into a social situation to 
observe whether he blushes, so we do the next 
best (and often much worse) thing and simply 
ask him, “Do you blush easily?” We assume 
that if he does in fact blush easily, he will realize 
that fact about himself, which is often a gratuitous 
assumption; and secondly, we hope that having 
recognized it, he will be willing to tell us so. 

Associated with this approach to structured
personality tests is the construction of items and
their assembling into scales upon an a priori
basis, requiring the assumption that the psychol-
ogist building the test has sufficient insight into
the dynamics of verbal behavior and its relation 
to the inner core of personality that he is able 
to predict beforehand what certain sorts of people 
will say about themselves when asked certain 
sorts of questions. The fallacious character of 
this procedure has been sufficiently shown by the 
empirical results of the Minnesota Multiphasic 
Personality Inventory alone, and will be discussed 
at greater length below. It is suggested tenta- 
tively that the relative uselessness of most struc- 
tured personality tests is due more to a priori 
item construction than to the fact of their being 
structured. 

The second approach to verbal self-ratings is 
rarer among test makers. It consists simply in 
the explicit denial that we accept a self-rating as 
a feeble surrogate for a behavior sample, and 
substitutes the assertion that a “self-rating” con- 
stitutes an intrinsically interesting and significant 
bit of verbal behavior, the non-test correlates of 
which must be discovered by empirical means. 
Not only is this approach free from the restric- 
tion that the subject must be able to describe his 
own behavior accurately, but a careful study of 
structured personality tests built on this basis
shows that such a restriction would falsify the 
actual relationships that hold between what a 
man says and what he is. 

Since this view of question-answer items is the 
rarer one at the present time, it is desirable at this 
point to elucidate by a number of examples. For 
this purpose one might consider the Strong Voca- 
tional Interest Blank, the Humm-Wadsworth Tem- 
perament Scales, the Minnesota Multiphasic Per- 
sonality Inventory, or any structured personality 
measuring device in which the selection of items 
was done on a thoroughly empirical basis using 
carefully selected criterion groups. In the exten- 
sive and confident use of the Strong Vocational 
Interest Blank, this more sophisticated view of 
the significance of responses to structured per- 
sonality test items has been taken as a matter of 
course for years. The possibility of conscious 
as well as unconscious “fudging” has been con- 
sidered and experimentally investigated by Strong 
and others, but the differences in possible inter- 
pretation or meaning of items have been more 
or less ignored—as well they should be. One is 
asked to indicate, for example, whether he likes, 
dislikes, or is indifferent to “conservative people.” 
The possibilities for differential interpretation of
a word like conservative are of course tremendous, 
but nobody has worried about that problem in 
the case of the Strong. Almost certainly the 
strength of verbs like “like” and “dislike” is 
variably interpreted throughout the whole blank. 
For the present purpose the Multiphasic (referred
to hereinafter as MMPI) will be employed be- 
cause the present writer is most familiar with it. 

One of the items on the MMPI scale for de- 
tecting psychopathic personality (Pd) is “My 
parents and family find more fault with me than 
they should.” If we look upon this as a rating 
in which the fact indicated by an affirmative re- 
sponse is crucial, we immediately begin to wonder 
whether the testee can objectively evaluate how 
much other people’s parents find fault with them, 
whether his own parents are warranted in finding 
as much fault with him as they do, whether this 
particular subject will interpret the phrase “find- 
ing fault” in the way we intend or in the way 
most normal persons interpret it, and so on. 
The present view is that this is simply an unprofit- 
able way to examine a question-answer personality 
test item. To begin with, the empirical finding is 
that individuals whose past history and momen- 
tary clinical picture is that of a typical psycho- 
pathic personality tend to say “Yes” to this much 
more often than people in general do. Now in 
point of fact, they probably should say “No” be- 
cause the parents of psychopaths are sorely tried 
and probably do not find fault with their incor- 
rigible offspring any more than the latter deserve. 
An allied item is “I have been quite independent 
and free from family rule” which psychopaths 
tend to answer false—almost certainly opposite 
to what is actually the case for the great majority 
of them. Again, “Much of the time I feel I have 
done something wrong or evil.” Anyone who 
deals clinically with psychopaths comes to doubt 
seriously whether they could possibly interpret 
this item in the way the rest of us do (cf.
Cleckley’s (2) “semantic dementia”), but they 
say that about themselves nonetheless. Numer- 
ous other examples such as “Someone has it in 
for me” and “I am sure I get a raw deal from 
life” appear on the same scale and are significant 
because psychopaths tend to say certain things 
about themselves, rather than because we take 
these statements at face value. 
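
The empirical selection of items described above can be illustrated with a small sketch. It is not the procedure actually used for the MMPI: the item labels, the response data, and the `min_diff` cut-off are all hypothetical, and a real derivation would rest on far larger groups and a test of significance. The point is only that an item is keyed in whichever direction the criterion group answers more often than normals, whether or not the answer is factually accurate.

```python
def endorsement_rate(responses):
    """Proportion of a group answering True to one item."""
    return sum(responses) / len(responses)


def empirical_key(criterion, normals, min_diff=0.25):
    """Key each item in the direction favored by the criterion group.

    `criterion` and `normals` map an item label to a list of True/False
    answers. Items whose endorsement rates differ by less than `min_diff`
    (a hypothetical cut-off) are left off the scale.
    """
    key = {}
    for item in criterion:
        diff = endorsement_rate(criterion[item]) - endorsement_rate(normals[item])
        if abs(diff) >= min_diff:
            key[item] = diff > 0  # True: scored when answered True
    return key


def scale_score(answers, key):
    """Count the answers given in the keyed direction."""
    return sum(1 for item, direction in key.items() if answers[item] == direction)


# Hypothetical responses (True = "Yes"), five persons per group.
criterion_group = {
    "parents find fault": [True, True, True, False, True],
    "free from family rule": [False, False, True, False, False],
    "likes gardening": [True, False, True, False, True],
}
normal_group = {
    "parents find fault": [False, False, True, False, False],
    "free from family rule": [True, True, False, True, True],
    "likes gardening": [True, False, True, False, False],
}

key = empirical_key(criterion_group, normal_group)
print(key)
print(scale_score({"parents find fault": True,
                   "free from family rule": False,
                   "likes gardening": True}, key))
```

In this toy data the third item does not separate the groups and is dropped, while the second is keyed in the "false" direction.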

Consider the MMPI scale for detecting tend-
encies to hypochondriasis. A hypochondriac says
that he has headaches often, that he is not in as 
good health as his friends are, and that he can- 
not understand what he reads as well as he used 
to. Suppose that he has a headache on an aver- 
age of once every month, as does a certain “nor- 
mal” person. The hypochondriac says he often 
has headaches, the other person says he does 
not. They both have headaches once a month, 
and hence they must either interpret the word 
“often” differently in that question, or else have 
unequal recall of their headaches. According to 
the traditional view, this ambiguity in the word 
“often” and the inaccuracy of human memory 
constitute sources of error; for the authors of the 
MMPI they may actually constitute sources of 
discrimination. 

We might mention as beautiful illustrations of 
this kind of relation, the non-somatic items in the 
hysteria scale of MMPI (8). These items have 
a statistical homogeneity and the common prop- 
erty by face inspection that they indicate the per- 
son to be possessed of unusually good social and 
psychiatric adjustment. They are among the 
most potent items for the detection of hysterics 
and hysteroid temperaments, but they reflect the 
systematic distortion of the hysteric’s conception 
of himself, and would have to be considered in- 
valid if taken as surrogates for the direct observa- 
tion of behavior. 

As a last example one might mention some 
findings of the writer, to be published shortly, in 
which “normal” persons having rather abnormal 
MMPI profiles are differentiated from clearly 
“abnormal” persons with equally deviant profiles 
by a tendency to give statistically rare as well as 
psychiatrically “maladjusted” responses to cer- 
tain other items. Thus a person who says that 
he is afraid of fire, that windstorms terrify him, 
that people often disappoint him, stands a better 
chance of being normal in his non-test behavior 
than a person who does not admit to these things.
The discrimination of this set of items for various 
criterion groups, the intercorrelations with other 
scales, and the content of the items indicate 
strongly that they detect some verbal-semantic 
distortion in the interpretation and response to 
the other MMPI items which enters into the 
spurious elevation of scores achieved by certain 
“normals.” Recent unpublished research on more 
subtle “lie” scales of the MMPI indicates that
unconscious self-deception is inversely related to the
kind of verbal distortion just indicated. 

In summary, a serious and detailed study of the 
MMPI items and their interrelations both with 
one another and non-test behavior cannot fail to 
convince one of the necessity for this second kind 
of approach to question-answer personality tests. 
That the majority of the questions seem by in- 
spection to require self-ratings has been a source 
of theoretical misunderstanding, since the stimu- 
lus situation seems to request a self-rating, where- 
as the scoring does not assume a valid self-rating 
to have been given. It is difficult to give any 
psychologically meaningful interpretation of some 
of the empirical findings on the MMPI unless the 
more sophisticated view is maintained. 

It is for this reason that the possible differences 
in interpretation do not cause us any a priori 
concern in the use of this instrument. Whether 
any structured personality test turns out to be 
valid and useful must be decided on pragmatic 
grounds, but the possibility of diverse interpreta- 
tions of a single item is not a good theoretical 
reason for predicting failure of the scales. There 
is a “projective” element involved in interpreting 
and responding to these verbal stimuli which 
must be recognized, in spite of the fact that the 
test situation is very rigidly structured as regards 
the ultimate response possibilities permitted. The 
objection that all persons do not interpret struc- 
tured test items in the same way is not fatal, 
just as it would not be fatal to point out that 
“ink blots do not look the same to everyone.” 

It has not been sufficiently recognized by critics 
of structured personality tests that what a man 
says about himself may be a highly significant 
fact about him even though we do not entertain 
with any confidence the hypothesis that what he 
says would agree with what complete knowledge 
of him would lead others to say of him. It is 
rather strange that this point is so often com- 
pletely passed by, when clinical psychologists 
quickly learn to take just that attitude in a diag- 
nostic or therapeutic interview. The complex 
defense mechanisms of projection, rationalization, 
reaction-formation, etc., appear dynamically to 
the interviewer as soon as he begins to take what 
the client says as itself motivated by other needs 
than those of giving an accurate verbal report. 
There is no good a priori reason for denying the 
possibility of similar processes in the highly struc-
tured “interview” which is the question-answer
personality test. The summarized experience of 
the clinician results (one hopes, at least) in his 
being able to discriminate verbal responses ad- 
missible as accurate self-descriptions from those 
which reflect other psychodynamisms but are not 
on that account any the less significant. The 
test analogue to this experience consists of the 
summarized statistics on response frequencies, at 
least among those personality tests which have 
been constructed empirically (MMPI, Strong, 
Rorschach, etc.). 

Once this has been taken for granted we are 
prepared to admit powerful items to personality 
scales regardless of whether the rationale of their 
appearance can be made clear at present. We 
do not have the confidence of the traditional per- 
sonality test maker that the relation between the 
behavior dynamics of a subject and the tendency 
to respond verbally in a certain way must be 
psychologically obvious. Thus it puzzles us but 
does not disconcert us when this relation cannot 
be elucidated, the science of behavior being in the 
stage that it is. That “I sometimes tease ani- 
mals” (answered false) should occur in a scale 
measuring symptomatic depression is theoretically 
mysterious, just as the tendency of certain schizo- 
phrenic patients to accept “position” as a
determinant in responding to the Rorschach may
be theoretically mysterious. Whether such a re- 
lation obtains can be very readily discovered 
empirically, and the wherefore of it may be left 
aside for the moment as a theoretical question. 
Verbal responses which do not apparently have 
any self-reference at all, but in their form seem 
to request an objective judgment about social 
phenomena or ethical values, may be equally 
diagnostic. So, again, one is not disturbed to 
find items such as “I think most people would lie 
to get ahead” (answered false) and “It takes a 
lot of argument to convince most people of the 
truth” (answered false) appearing on the hysteria 
scale of the MMPI. 

The frequently alleged “superficiality” of struc- 
tured personality tests becomes less evident on 
such a basis also. Some of these items can be 
rationalized in terms of fairly deep-seated trends
of the personality, although it is admittedly diffi-
cult to establish that any given depth interpretation
is the correct one. To take one example,
the items on the MMPI scale for hysteria which
were referred to above as indicating extraordinar-
ily good social and emotional adjustment can 
hardly be seen as valid self-descriptions. How- 
ever, if the core trend of such items is summar- 
ily characterized as “I am psychiatrically and 
socially well adjusted,” it is not hard to fit such 
a trend into what we know of the basic person- 
ality structure of the hysteric. The well-known 
belle indifference of these patients, the great lack 
of insight, the facility of repression and dissocia- 
tion, the “impunitiveness” of their reactions to 
frustration, the tendency of such patients to show 
an elevated “lie” score on the MMPI, may all be 
seen as facets of this underlying structure. It 
would be interesting to see experimentally whether 
to the three elements of Rosenzweig’s “triadic hy- 
pothesis” (impunitiveness, repression, hypnotiza- 
bility) one might add a fourth correlate—the 
chief non-somatic component of the MMPI hys- 
teria scale. 

Whether “depth” is plumbed by a structured 
personality test to a lesser extent than by one 
which is unstructured is difficult to determine, 
once the present view of the nature of structured 
tests is understood. That the “deepest” layers of 
personality are not verbal might be admitted with- 
out any implication that they cannot therefore 
make themselves known to us via verbal behavior. 
Psychoanalysis, usually considered the “deepest” 
kind of psychotherapy, makes use of the depend- 
ency of verbal behavior upon underlying vari- 
ables which are not themselves verbalized. 

The most important area of behavior consid- 
ered in the making of psychiatric diagnosis is 
still the form and content of the speech of the 
individual. I do not mean to advance these con- 
siderations as validations of any structured per- 
sonality tests, but merely as reasons for not ac- 
cepting the theoretical objection sometimes of- 
fered in criticizing them. Of course, structured 
personality tests may be employed in a purely 
diagnostic, categorizing fashion, without the use 
of any dynamic interpretations of the relationship 
among scales or the patterning of a profile. For 
certain practical purposes this is quite permis- 
sible, just as one may devote himself to the statis- 
tical validation of various “signs” on the Ror- 
schach test, with no attempt to make qualitative 
or really dynamic personological inferences from 
the findings. The tradition in the case of struc- 
tured personality tests is probably weighted on
the side of non-dynamic thinking; and in the case
of some structured tests, there is a considerable 
amount of experience and clinical subtlety re- 
quired to extract the maximum of information. 
The present writer has heard discussions in case 
conferences at the University of Minnesota Hos- 
pital which make as “dynamic” use of MMPI 
patterns as one could reasonably make of any 
kind of test data without an excessive amount of 
illegitimate reification. The clinical use of the 
Strong Vocational Interest Blank is another ex- 
ample. 

In discussing the “depth” of interpretation pos- 
sible with tests of various kinds, it should at least 
be pointed out that the problem of validating per- 
sonality tests, whether structured or unstructured, 
becomes more difficult in proportion as the in- 
terpretations increase in “depth.” For example, 
the validation of the “sign” differentials on the 
Rorschach is relatively easier to carry out than 
that of the deeper interpretations concerning the 
basic personality structure. This does not imply 
that there is necessarily less validity in the latter 
class of inferences, but simply stresses the diffi- 
culty of designing experiments to test validity. 
A very major part of this difficulty hinges upon 
the lack of satisfactory external criteria, a situa- 
tion which exists also in the case of more dynamic 
interpretations of structured personality tests. 
One is willing to accept a staff diagnosis of psy- 
chasthenia in selecting cases against which to 
validate the Pt scale of the MMPI or the F% as a 
compulsive-obsessive sign on the Rorschach. But 
when the test results indicate repressed homo- 
sexuality or latent anxiety or lack of deep insight 
into the self, we may have strong suspicions that 
the instrument is fully as competent as the psy- 
chiatric staff. Unfortunately this latter assump- 
tion is very difficult to justify without appearing 
to be inordinately biased in favor of our test. 
Until this problem is better solved than at pres- 
ent, many of the “depth” interpretations of both 
structured and unstructured tests will be little 
more than an expression of personal opinion. 

There is one advantage of unstructured per- 
sonality tests which cannot easily be claimed for 
the structured variety, namely, the fact that false- 
hood is difficult. While it is true for many of the 
MMPI items, for example, that even a psychol- 
ogist cannot predict on which scales they will 
appear nor in what direction certain sorts of 
abnormals will tend to answer them, still the
relative accessibility of defensive answering would 
seem to be greater than is possible in responding 
to a set of ink-blots. Research is still in progress 
on more subtle “lie” scales of the MMPI and we 
have every reason to feel encouraged on the pres- 
ent findings. Nevertheless the very existence of 
a definite problem in this case and not in the case 
of the Rorschach gives the latter an advantage in 
this respect. When we pass to a more structured 
method, such as the T.A.T., the problem re- 
appears. The writer has found, for example, a 
number of patients who simply were not fooled 
by the “intelligence-test” set given in the direc- 
tions for the T.A.T., as was indicated quite 
clearly by self-references and defensive remarks, 
especially on the second day. Of course such a 
patient is still under pressure to produce material 
and therefore his unwillingness to reveal himself 
is limited in its power over the projections finally 
given. 

In conclusion, the writer is in hearty agree- 
ment with Lieutenant Hutt that unstructured per- 
sonality tests are of great value, and that the final 
test of the adequacy of any technique is its utility 
in clinical work. Published evidence of the valid- 
ity of both structured and unstructured personality 
tests as they had to be modified for convenient 
military use does not enable one to draw any 
very definite conclusions or comparisons at the 
present time. There is assuredly no reason for 
us to place structured and unstructured types of 
instruments in battle order against one another, 
although it is admitted that when time is limited 
they come inevitably into a very real clinical 
“competition” for use. The present article has 
been aimed simply at the clarification of certain 
rather prevalent misconceptions as to the nature 
and the theory of at least one important struc- 
tured personality test, in order that erroneous 
theoretical considerations may not be thrown into 
the balance in deciding the outcome of such clin- 
ical competition. 


THE RELATIONSHIP BETWEEN THE
JUDGED DESIRABILITY OF A TRAIT
AND THE PROBABILITY THAT THE
TRAIT WILL BE ENDORSED *


ALLEN L. EDWARDS ¹


There is a rather common suspicion among 
many psychologists that subjects tend to give 
what are considered to be socially desirable re- 
sponses to items in personality inventories. This 
suspicion has been given public expression in a 
recent article by Gordon (3, p. 407) who com- 
ments upon “. . . the motivation of a majority of 
respondents to mark socially acceptable alterna- 
tives to items, rather than those which they be- 
lieve apply to themselves.” 

We have here two problems. One concerns 
the truthfulness of a subject’s answers to items 
in a personality inventory, i.e., whether the re- 
sponse accurately describes the subject. The an- 
swer to this question implies that we have avail- 
able some independent criterion in terms of which 
the inventory response is to be evaluated. The 
other problem concerns the relationship between 
a subject’s response to an item and the social 
desirability of that item, i.e., whether the subject 
tends to give a positive answer to an item that 
is socially desirable and a negative answer to an 
item that is not. The answer to this question 
implies that we have available some measure of 
the social desirability of the item to which the
response can be related. It is this problem we
wish to report upon here.


* Reprinted by permission from The Journal of
Applied Psychology, April, 1953, Vol. 37, No. 2,
90-93.

1 This paper was presented before the Western Psy-
chological Association, Fresno, California, April 26,
1952. It is part of a research program made possible
by an appointment as a Faculty Research Fellow of
the Social Science Research Council.


THE PRESENT STUDY 


The hypothesis to be investigated may be stated 
in this way: If the behavior indicated by an in- 
ventory item is socially desirable, the subject will 
tend to attribute it to himself; if it is undesirable, 
he will not. This hypothesis may be put more 
precisely: The probability of endorsement of per- 
sonality items is a monotonic increasing function 
of the scaled social desirability of the items. 

To study the relationship between the prob- 
ability of endorsement of personality trait items 
and the social desirability of the items requires 
that we determine independently two measures: 
the probability of endorsement and the social de- 
sirability scale value of the items. This study 
thus consists of two parts: in the first, the scale 
values of the items are determined; in the sec- 
ond, the probability of endorsement is related to 
the independently determined scale values. 


DETERMINING THE SCALE VALUES 


A total of 140 personality trait items, based 
upon Murray’s (4) discussion of needs, were writ- 
ten and edited. The items were selected so that 
14 needs were investigated with 10 items sup- 
posedly indicative of each need. The items were 
arranged in 10 sets of 14 items each, so that each 
set consisted of one item relating to each of the 
needs. 

The items were presented to subjects with in- 
structions to judge the degree of social desirabil- 
ity of the behavior indicated by each item in 
terms of how the behavior would be regarded in 
others. Judgments were made in terms of nine 
successive intervals, with the lowest interval rep- 
resenting extreme undesirability and the highest 
extreme desirability. The rating system was ex- 
plained in terms of a sample set of four items 
for which judgments had already been obtained. 
After these ratings had been discussed, the in- 
structions to the subjects concluded with the fol- 
lowing statement: 

“Indicate your own judgments of the desirabil- 
ity or undesirability of the traits which will be 
given to you by the examiner in the same manner. 
Remember that you are to judge the traits in terms 
of whether you consider them desirable or
undesirable in others. Be sure to make a judgment
about each trait.” 

The subjects judging the desirability of the 
items consisted of 86 men and 66 women, a total 
of 152 subjects. Twenty-six of the subjects were 
under 20 years of age, 97 were between 20 and 
30 years of age, and 29 were over 30 years of age. 

Cumulative distributions of the judgments were
made separately by age and by sex groups. For
each item we then found the interval in which
the median of the distribution of judgments would
fall.

FIG. 1. Interval in which the median of the
women’s distribution of judgments would fall
plotted against the interval in which the median
of the men’s distribution of judgments would fall.

In Fig. 1, we show the plot of the women’s 
intervals against the corresponding values for the 
men. It may be noted that in the case of only 
two items would the medians be separated by as 
much as two intervals. For 43 of the items the 
medians might possibly be separated by as much 
as one interval. For the remaining 95 items the 
medians would all fall within the same interval. 
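
The comparison of median intervals can be sketched as follows. The judgment counts are invented for illustration (only the group sizes, 86 men and 66 women, come from the text), and the function name is hypothetical. For one item, the sketch finds the interval containing the median of each group's judgments and reports how many intervals apart the two medians fall.

```python
def median_interval(counts):
    """Return the 1-based interval in which the median judgment falls.

    `counts[i]` is the number of judges placing the item in interval i + 1
    of the nine successive intervals.
    """
    total = sum(counts)
    if total == 0:
        raise ValueError("no judgments recorded")
    running = 0
    for interval, count in enumerate(counts, start=1):
        running += count
        if running >= total / 2:
            return interval


# Hypothetical judgments of a single item across the nine intervals.
men = [0, 1, 3, 10, 30, 25, 12, 4, 1]     # 86 judges
women = [0, 0, 2, 6, 20, 24, 10, 3, 1]    # 66 judges

separation = abs(median_interval(men) - median_interval(women))
print(median_interval(men), median_interval(women), separation)
```

An item like this one would be counted among those whose medians are separated by one interval, even though the two distributions are nearly alike.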

In the case of many of the items falling out- 
side the principal diagonal of Figure 1, the me- 
dians would still be approximately the same for 
the reason that the medians of both distributions 
are close to the limit of the interval, but one hap- 
pens to fall slightly above and the other slightly 
below the limit. 


A similar analysis of the judgments was made 
in terms of the age variable. Examination of the 
separate distributions indicated that the scale 
values that would thus be obtained would be 
comparable and that little distortion would be in- 
troduced by pooling the judgments for all groups. 

On the basis of the combined distributions, the 
scale values of the 140 items were found. The 
scale values were determined by the method of 
successive intervals (1). This method of scaling 
does not involve any assumption of equality of 
the successive rating intervals. 

After determining the widths of the successive 
intervals and the scale values of the items on the 
psychological continuum of social desirability, an 
internal consistency test was applied (1). Using 
the 147 parameters calculated from the data, it 
was possible to reproduce the 1,120 independent, 
empirical observations with an average error of 
.023. This value, it may be mentioned, compares 
favorably with that usually obtained from inter- 
nal consistency tests used when stimuli are scaled 
by the method of paired comparisons. 
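
A minimal sketch of successive-intervals scaling and of an internal consistency check of this kind is given here. It is not presented as the exact computation used in the study. It assumes equal discriminal dispersions for all items, cumulative proportions strictly between 0 and 1, and the first interval boundary as an arbitrary origin; the function names and the demonstration data are hypothetical. With 140 items and 8 interval boundaries, the same bookkeeping would give 140 scale values plus 7 interval widths (147 parameters) and 140 x 8 = 1,120 observed cumulative proportions.

```python
from statistics import NormalDist

_NORMAL = NormalDist()


def successive_intervals(cum_props):
    """cum_props[i][j]: proportion of judges rating item i at or below boundary j."""
    n_items = len(cum_props)
    n_bounds = len(cum_props[0])
    # Normal deviates; inv_cdf raises an error for proportions of exactly 0 or 1.
    z = [[_NORMAL.inv_cdf(p) for p in row] for row in cum_props]
    # Width of each interval: mean difference between adjacent deviates.
    widths = [
        sum(z[i][j + 1] - z[i][j] for i in range(n_items)) / n_items
        for j in range(n_bounds - 1)
    ]
    # Boundary positions, taking the first boundary as origin.
    bounds = [0.0]
    for width in widths:
        bounds.append(bounds[-1] + width)
    # Scale value of an item: mean of (boundary position - deviate).
    scale = [
        sum(bounds[j] - z[i][j] for j in range(n_bounds)) / n_bounds
        for i in range(n_items)
    ]
    return widths, bounds, scale


def average_error(cum_props, bounds, scale):
    """Mean absolute difference between observed and reproduced proportions."""
    errors = [
        abs(p - _NORMAL.cdf(bounds[j] - scale[i]))
        for i, row in enumerate(cum_props)
        for j, p in enumerate(row)
    ]
    return sum(errors) / len(errors)


# Synthetic demonstration: proportions generated from known parameters.
true_bounds = [-1.5, -0.5, 0.5, 1.5]
true_scale = [-1.0, 0.0, 1.0]
cum_props = [[_NORMAL.cdf(t - s) for t in true_bounds] for s in true_scale]
widths, bounds, scale = successive_intervals(cum_props)
error = average_error(cum_props, bounds, scale)
```

Because the synthetic proportions fit the model exactly, the average error here is essentially zero; with empirical judgments it would be a small positive value of the kind reported in the text.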


RELATIONSHIP BETWEEN SCALE VALUES 
AND PROBABILITY OF ENDORSEMENT 


In the second part of this study, a sample of 
140 pre-medical and pre-dental students re- 
sponded to the same set of items for which we 
had previously determined the scale values on the 
psychological continuum of social desirability. 
This time, however, the items appeared in a 
printed form as a personality inventory. The 
inventory was part of a test battery which was 
administered for the Medical and Dental Schools 
of the University of Washington. The instruc- 
tions were those that are commonly used with 
personality inventories. A “Yes” response indi- 
cated that the subject believed that a given item 
was characteristic of himself and a “No” response 
that it was not. 

Item counts were made for each item, by 
means of IBM equipment, and the per cent re- 
sponding “Yes” was then found for each item. 
This per cent is the proportion of the sample 
indicating that the behavior stated by a particular 
item is characteristic of themselves. The pro- 
portions may be taken as the probability of en- 
dorsement of a particular trait item for the sam- 
ple at hand. 
Fig. 2. Probability of endorsement of a trait item plotted against the social desirability 
scale value of the item. The product-moment correlation coefficient is .871. 


The probability of endorsement of each item 
was plotted against the previously, and independ- 
ently, determined social desirability scale value 
of the item. This plot is shown in Fig. 2. On 
the Y-axis we have the probability of endorse- 
ment and on the X-axis the social desirability 
scale value. It is apparent that the probability 
of endorsement is a linear function of the scaled 
desirability of the item.2 The product-moment 
correlation coefficient is .871. 
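
The two quantities related here, the proportion of "Yes" responses per item and its product-moment correlation with the scale values, can be computed as in the following sketch. The response matrix and scale values are invented for illustration, and the function names are hypothetical.

```python
from math import sqrt


def endorsement_proportions(responses):
    """responses[s][i] is True when subject s answered "Yes" to item i."""
    n_subjects = len(responses)
    n_items = len(responses[0])
    return [sum(row[i] for row in responses) / n_subjects for i in range(n_items)]


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: four subjects, three items.
responses = [
    [True, False, False],
    [True, True, False],
    [True, True, False],
    [True, False, True],
]
scale_values = [3.0, 2.0, 1.0]  # hypothetical social desirability scale values

proportions = endorsement_proportions(responses)
r = pearson_r(scale_values, proportions)
```

In the study itself the same calculation would run over 140 items and 140 respondents.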


DISCUSSION 


The data clearly indicate that the probability 
of endorsement of an item increases with the 
judged desirability of the item. This does not 
necessarily mean that the subjects are misrepre- 
senting themselves on the inventory. It may be 
that traits which are judged as desirable are those 
which are fairly widespread or common among 
members of a culture or group. That is, if a 
pattern of behavior is prevalent among members 
of a group, it will be judged as desirable; if it is 
uncommon, it will be judged as undesirable. We 
might thus expect items indicating desirable traits 
to be endorsed more frequently than items in- 
dicating undesirable traits. 

2 There is a slight indication of departure from 
linearity at the two extremes of the scale value axis. 
This is probably because of the limit placed upon the 
plotted points in terms of the Y-axis. The departure 
from linearity, however, is not statistically significant. 

It is also possible that the behavior indicated 
by an item with a high social desirability scale 
value is not common, but that the subject taking 
the inventory is trying, consciously or uncon- 
sciously, to give a good impression of himself. 
He therefore tends to distort his answers in such 
a way as to make himself out as having more 
of the socially desirable traits and fewer of the 
socially undesirable traits than might be the case 
if his behavior were evaluated in terms of some 
other independent criterion. 

Either one or both of the interpretations pre- 
sented would account for the relationship be- 
tween probability of endorsement and scaled de- 
sirability of the item. I have no data to support 
the interpretation that the subjects misrepresented 
themselves on the inventory, but Ellis (2) in his 
recent review cites quite a few studies which 
would indicate that this is the case. 

If this is true, then in a personality inventory 
we should attempt to minimize the tendency for 
a given response to be determined primarily by 
the factor of social desirability. A suggested 
solution is to pair items indicative of different 
traits in terms of their social desirability scale 
values. If the subject is then forced to choose 
between the two items, his choice obviously can- 
not be upon the basis of the greater social desir- 
ability of one of the items. 
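
The suggested pairing of items can be illustrated with a short sketch. It is one possible way of matching items, not a procedure taken from the text: items are paired greedily in order of scale value, a pair must come from different traits, and the two scale values must differ by no more than a chosen tolerance. The item labels, trait names, scale values, and tolerance are all hypothetical.

```python
def pair_items(items, tolerance):
    """Pair items of different traits whose scale values lie within `tolerance`.

    items: list of (label, trait, scale_value) tuples.
    Returns a list of (label, label) pairs; unmatched items are dropped.
    """
    remaining = sorted(items, key=lambda item: item[2])
    pairs = []
    while len(remaining) > 1:
        first = remaining.pop(0)
        for k, other in enumerate(remaining):
            if other[2] - first[2] > tolerance:
                break  # every later item is even further away
            if other[1] != first[1]:
                pairs.append((first[0], remaining.pop(k)[0]))
                break
    return pairs


# Hypothetical items: (label, trait, social desirability scale value).
items = [
    ("A1", "achievement", 1.10),
    ("B1", "order", 1.15),
    ("A2", "achievement", 2.40),
    ("B2", "order", 2.45),
    ("A3", "achievement", 3.90),
]
pairs = pair_items(items, tolerance=0.10)
```

Within each resulting pair the two statements are nearly equal in judged desirability, so a forced choice between them cannot rest on that factor alone.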
CONTENT AND STYLE IN 
PERSONALITY ASSESSMENT * 


DOUGLAS N. JACKSON AND SAMUEL MESSICK 1 


* Reprinted by permission from the Psychological 
Bulletin, 1958, Vol. 55, No. 4, 243-252. 

1 Portions of this paper were read at a symposium 
on “Experimental Approaches to Personality Assess- 
ment” at the American Psychological Association 
Meetings in New York, 1957. 

The authors express their thanks to Lee Sechrest 
and Riley W. Gardner for commenting on the content 
and style of the manuscript. 


In personality theory a ubiquitous and funda- 
mental distinction may be drawn between the in- 
terpretation of behavior in terms of (a) the con- 
tent of “needs” and of cognitive structures gen- 
erally and in terms of (b) characteristic styles 
of response and action. The separation of these 
two components of personality organization has 
taken a variety of forms in the hands of different 
theorists, as in the Allport-Vernon (2) Studies 
in Expressive Movements, in Murphy's (47) 
scholarly discussion of continuity in personality 
structure, in Klein’s (40) distinction between 
needs and control processes, and in Vernon’s (54) 
distinction between adaptive and expressive be- 
havior. One may legitimately ask not only what 
a person says or does (the particular content of 
his statements and actions) but how he acts (his 
characteristic mode or style of expression). 

What is conceptually a relatively sharp distinc- 
tion is typically blurred and confounded in a par- 
ticular concrete act; the what and how are fused 
in a given goal-directed response. An obsequious 
person indicates his deference not only by the 
act of yielding, but by the tone of his voice in 
performing the yielding act. Because content 
and style are intermixed in a given behavior se- 
quence, and because there is often a theoretical 
predilection for content components, style is often 
overlooked in personality assessment. Also, the 
measurement of content appears to be more di- 
rect and unambiguous than the assessment of 
stylistic dimensions of personality. It is possible, 
for example, to ask a person what his attitude is 
on a given topic, or to draw inferences about his 
need patterns from his reported likes and dislikes 
(51). The obviousness of such devices, while 
helpful from the viewpoint of labeling what one 
hopes one is measuring, also permits respondents 
to distort their scores if they so desire (32), some- 
thing which is less likely to occur in the assess- 
ment of style. 

In considering the general distinction between 
content and style, those methods of personality 
and attitude assessment which are based upon 
printed questionnaires of one form or another 
will be emphasized. While the complementary 
constructs of content and style have special rele- 
vance to questionnaire items, where the response- 
evoking properties of the particular item form 
may contribute markedly to response variance 
above and beyond the contribution of content, 
the distinction might also be applied usefully to 
other areas of personality assessment. For ex- 
ample, three possible applications are to per- 
ceptual and cognitive style as in the work of 
Thurstone (52), Witkin (58), Klein (39, 40), 
Gardner (27), and others (34); to achievement 
and aptitude testing (28, 32, 60); and to the per- 
ception of personality (2, 38, 54, 59). 

The present discussion attempts to do two 
things: first, to present some evidence showing 
the important and subtle influences upon re- 
sponses of stylistic components of item form; 
and, second, to illustrate how reliable measures 
of potentially useful stylistic dimensions may be 
generated from characteristic responses to the 
form of personality and attitude items as distinct 
from measures of content. 


PERSONALITY STYLE AND RESPONSE SET 


Traditionally, responses to a particular item 
or set of items are assumed to provide informa- 
tion about the respondent in terms of the item 
content. If, for example, a person agrees with 
the statement, “Under no conditions is war justi- 
fied,” or answers “true” to the item, “I have more 
trouble concentrating than others seem to have,” 
it is commonly assumed that these responses, if 
consistent, will indicate respectively something 
about the person’s attitude toward war or his men- 
tal state. Under these conditions response de- 
terminants such as the subjects’ generalized tend- 
ency to agree are legitimately considered as 
sources of cumulative error, Cronbach’s (13, 
14) familiar “response sets.” While Cronbach’s 
emphasis was that response sets often lead to er- 
rors of interpretation in the logical validity of 
tests, he also indicated that these response tend- 
encies might not always be temporary and trivial, 
but may have a stable and valid component which 
reflects a consistent individual style or personal- 
ity trait. While recognizing Cronbach’s contri- 
bution in describing the phenomenon, it is prefer- 
able for the present purposes to change the label 
from “response set” to components of style. This 
change in terms emphasizes the fact that for cer- 
tain purposes in personality assessment oppor- 
tunities for the expression of personal modes for 
responding should be enhanced and capitalized 
upon, rather than considered as sources of error 
to be avoided or minimized. This change also 
avoids the ambiguity inherent in the concept of 
“set” (22). 


CHARACTERISTIC STYLES IN PERSONALITY 
AND ATTITUDE QUESTIONNAIRES 


Among the more prominent response styles 
usually evoked by questionnaire items are re- 
sponse acquiescence, overgeneralization, a tend- 
ency to respond in a socially desirable way, and 
the complementary tendencies to respond nega- 
tivistically, critically, and in a socially undesir- 
able or idiosyncratic manner. Some pertinent il- 
lustrations will be drawn of how each of these, 
operating singly and in combination, may influ- 
ence the interpretation of responses to psycho- 
logical tests. Alternative procedures for evaluat- 
ing these stylistic variables will then be discussed. 

Response Acquiescence and Authoritarianism. 
—It has long been recognized that a subject who 
agrees with a personality or attitude item stated 
in a positive form may not necessarily disagree 
with its logical opposite, but may instead show a 
fairly general tendency toward agreement or dis- 
agreement. Studies by Rundquist and Sletto 
(49), by Lorge (42), and reviews by Cronbach 
(13, 14), Berg (8), and Messick and Jackson 
(45), indicate that response acquiescence is wide- 
spread and pervasive over a wide variety of item 
content and most pronounced when content is 
highly ambiguous or imaginary. Berg (8, 9) has 
suggested that acquiescence is a modal response 
in our culture when the issue before the respond- 
ents is unimportant or nonexistent. 

The operation of such stylistic tendencies 
should be taken into account in the course of 
personality measurement. If a particular content 
area is to be assessed, it is at least necessary to 
introduce into the scaling procedure appropriate 
experimental controls for acquiescence, or else 
reconcile oneself to interpretive equivocality due 
to the confounding of content and style in a single 
measure. Other response determinants besides 
acquiescence, however, must be controlled before 
characteristics may be unequivocally attributed 
to respondents on the basis of item content. Mes- 
sick and Jackson (45) have discussed alternative 
methods for reducing this ambiguity of interpre- 
tation in the measurement of authoritarian atti- 
tudes. 


2 Gage, Leavitt, and Stone (20) have argued that 
confounding content and style in the F scale, far 
from being a source of error, is fortunate, because 
acquiescence contributes to the empirical validity of 
the F scale as assessed by independent ratings of 
authoritarian behavior. If the aim is merely to pre- 
dict authoritarianism as a criterion, like predicting the 
success of salesmen, this argument might be legitimate 
as long as the criterion did not change. But if one 
hopes to understand the various components of a 
dynamic construct like authoritarianism, conglomer- 
ate indices containing both content and style will not 
suffice and will confuse the issues (45). 

3 Howard, T. W., and Sommer, R. “A Critical Ex- 
amination of ‘Ideology, Personality, and Institutional 
Policy in the Mental Hospital.’” Unpublished manu- 
script. 


Even though much of the recent research with 
the California F scale (1) has been of a methodo- 
logical and critical nature, it nevertheless yields 
some important information on the relationship 
between content and style. A number of investi- 
gators (5, 46, 36, 37, 41, 45) have independently 
correlated scores based on the California F scale, 
in which all of the items are so worded that agree- 
ment is always scored in the authoritarian direc- 
tion, with scores based on logically reversed F- 
scale items. These correlations were not found 
to be high and negative, as would be expected 
from consistent responses to item content. With 
one reversed F scale (36), significant positive 
correlations in the acquiescence rather than the 
content direction were obtained. Furthermore, 
there is evidence (37) that previously obtained 
relationships between personality variables and 
the F scale, formerly thought to be interpretable 
in terms of correlates of authoritarian ideology 
or content, may need reinterpretation in terms of 
consistencies in style. The most recent study 
requiring such reinterpretation is one by Gilbert 
and Levinson (23), in which a scale purportedly 
measuring “custodial mental illness ideology” was 
constructed, with 17 of 20 items requiring agree- 
ment to be scored as “custodial ideology.” A 
high correlation between the “custodial ideology” 
scale and the F scale was used to support the 
conclusion that “preference for a custodialistic 
orientation is part of a broader pattern of per- 
sonal authoritarianism.” But Howard and Som- 
mer in a replication 3 found that “custodialism” 
correlated significantly with agreements to both 
the original and the Jackson-Messick (36) re- 
versed F scales, which would seem to indicate 
that style rather than content is of primary im- 
portance in this instance. 

Christie, Havel, and Seidenberg (12) have 
shown that it is possible in some samples to ob- 
tain a correlation between reversed and original 
F-scale items significant in the content direction. 
Jackson, Messick, and Solley (37) had previously 
reported a correlation of +.35 between agree- 
ments to original and to reversed F-scale items. 
What accounts for this apparently considerable 
discrepancy? One set of investigators predicted 
and obtained a correlation significant in the 
acquiescence direction, while another, with a 
different reversed F scale, predicted and obtained 
a correlation in the content direction. The an- 
swer to this question requires a consideration of 
more than differences in the content of the two 
reversed F scales; the form of the items must be 
examined. Jackson and Messick (36) indicated 
that the original, extremely worded, cliché-ridden 
style of the F scale was retained in their reversals, 
while Christie, Havel, and Seidenberg (12) ex- 
plicitly avoided the sweeping generalizations found 
in the originals and substituted much more cau- 
tious statements. It is likely that this difference 
in item form accounts for the different results of 
the two sets of investigators. It appears that the 
tendency to endorse statements containing phrases 
such as “every person,” “no person,” “all,” “most 
important,” “complete certainty,” “never,” “must,” 
etc., is a general one, which may act independ- 
ently of the content. This response style to 
overgeneralize may contribute to relationships be- 
tween the F scale and cognitive variables like 
rigidity (37) and perceptual intolerance for am- 
biguity (18). It probably also partially accounts 
for the frequent observation that verbally elicited 
ethnic attitudes tend to be highly intercorrelated 
(10), even, for example, in Hartley’s (30) study 
where the “groups” were nonexistent and no 
previous attitude or “cognitive structure” could 
be assumed to exist. An appraisal of variance 
associated with aspects of authoritarian content 
on one hand, and stylistic components like re- 
sponse acquiescence and overgeneralization on 
the other, would seem to require at least four 
sets of items: an extremely worded original and 
reversed F scale and a probabilistic original and 
reversed F scale. It is suspected that subjects 
endorsing probabilistic F-scale items would not 
show as much of the “authoritarian’s” intolerance 
for ambiguity as might be expected, although 
some relationship between authoritarian ideology 
and response style might still be obtained. 
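
The logic of comparing original and reversed scales can be made concrete with a small sketch. If subjects respond to content, the number of agreements to the original items and to their logical reversals should correlate negatively; if they respond acquiescently, the correlation should be positive. The agreement counts are invented for illustration and `pearson_r` is a hypothetical helper.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical number of agreements per subject on ten original F-scale items.
original = [8, 6, 4, 7, 3]

# Acquiescent pattern: those who agree with the originals also agree
# with the reversals.
reversed_acquiescent = [7, 6, 3, 6, 2]

# Content-consistent pattern: agreement with an original implies
# disagreement with its reversal.
reversed_content = [10 - score for score in original]

r_acquiescence = pearson_r(original, reversed_acquiescent)  # positive
r_content = pearson_r(original, reversed_content)           # negative
```

The four-scale design proposed in the text would repeat this comparison separately for extremely worded and for probabilistic items.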

Response Acquiescence in Personality Inven- 
tories.—The distinct roles of content and style 
should also be noted in responses to personality 
inventories, especially those “true-false” devices 
like the MMPI developed by the empirical selec- 
tion of discriminating items. While few, if any, 
investigators have ever explicitly assumed that 
the total number of empirically derived scales 
was the most parsimonious way of summarizing 
the common variance of an inventory, the use 
of a large number of separate scales as, for ex- 
ample, in the 9 clinical scales of the MMPI or 
the 18 scales of Gough’s California Psychological 
Inventory, is justified by the extent to which each 
makes some independent contribution to the as- 
sessment problem not made by the other scales.4 
If there is a great deal of common variance among 
the various scales, this redundancy limits their 
efficiency. 

There is considerable evidence that a very few 
factors account for the major proportion of the 
variance on personality inventories of the “true- 
false” variety. Wheeler, Little, and Lehner (57), 
for example, reported a factor analysis of MMPI 
scales in which only two major factors and one 
minor factor were identified. In the light of 
accumulating evidence it seems likely that the 
major common factors in personality inventories 
of the true-false or agree-disagree type, such as 
the MMPI and the California Psychological In- 
ventory, are interpretable primarily in terms of 
style rather than specific item content. 

One line of departure from which it is possible 
to evaluate the role of acquiescence in personality 
inventories is to consider the percentage of items 
keyed “true” in each scale as an index of the 
extent to which that scale elicits response ac- 
quiescence. Jackson (33) did this with the Cali- 
fornia Psychological Inventory, computing rank 
order correlations between the percentage “true” 
in each scale and the scale’s correlation with out- 
side personality measures shown previously to 
reflect acquiescence. A number of high and 
significant correlations with such unidirectional 
scales as the California F scale and the MMPI 
K scale suggests strongly that acquiescence is a 
major source of variance in the CPI. 
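
A rank order correlation of the kind described can be sketched as follows. The percentages of items keyed "true" and the correlations with an outside measure are invented for illustration; the sketch uses the usual Spearman approach of correlating ranks, with tied values given their average rank, which may or may not match the exact formula used in the study cited.

```python
from math import sqrt


def ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            result[k] = average
        pos = end + 1
    return result


def rank_order_correlation(xs, ys):
    """Spearman coefficient: product-moment correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(rx)
    mean_x = sum(rx) / n
    mean_y = sum(ry) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sxx = sum((a - mean_x) ** 2 for a in rx)
    syy = sum((b - mean_y) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)


# Hypothetical values for five scales.
percent_keyed_true = [20, 35, 50, 65, 80]
correlation_with_outside_measure = [-0.30, -0.10, 0.05, 0.40, 0.25]

rho = rank_order_correlation(percent_keyed_true, correlation_with_outside_measure)
```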


4 The MMPI was advanced initially as an aid in 
the prediction of psychiatric diagnoses. In practice 
it is rarely so used in any literal sense, which is for- 
tunate, as the research evidence (e.g., 7, 48) indicates 
that predictions of specific diagnoses generally cannot 
be made with certainty. Rather, the original purpose 
of the MMPI, prediction, has come to be modified so 
that now scores, singly or in combination, are used to 
draw inferences about characteristics of respondents 
(56). Somewhat different notions of validity (15) 
and a different mathematical model (27, 53) are nec- 
essary in the latter case. 


Messick and Jackson 5 have obtained evidence 
of a similar nature for the MMPI. They ob- 
tained rank order correlations in the seventies 
between each scale’s percentage “true” and its 
loading on the first factor as reported in each of 
several factor analytic studies. Preliminary re- 
sults suggest that the first factor of the MMPI is 
interpretable in terms of acquiescence. Equally 
striking is a recent factor analytic study by Welsh 
(55), who sought to obtain pure-factor MMPI 
scales through a variant of the internal con- 
sistency method. He was rather successful in 
developing two such scales, labeled A (for anx- 
iety) and R (for repression), which loaded 
highly on the first and second factors, respec- 
tively. The remarkable thing about these scales 
is that all but one of the 39 items measuring the 
first factor are keyed “true,” while all 40 items 
for the second pure factor scale are keyed 
“false.” Even though Welsh’s two scales are 
predominantly unidirectional, one in the “true,” 
and the other in the “false” direction, they yield 
only low negative correlations with each other. 
This would lead one properly to reject the notion 
that a simple response set was sufficient to ac- 
count for all of the variance in the two scales. 
Nevertheless, each scale does seem to have an 
acquiescence component, for such a distribution 
of “true” and “false” items would be unlikely to 
occur by chance, and Jackson (33) has shown 
that correlations based on both scales correlated 
significantly with percentage keyed “true” in each 
CPI scale. Thus, careful consideration must be 
given to the possibility that response acquiescence 
is interacting with another variable, either of con- 
tent or of style, and that responses are determined 
in part by this interaction. As with the F scale, 
where acquiescence operates most strikingly in 
conjunction with statements in the form of sweep- 
ing generalizations, it may be that acquiescence 
on the MMPI is elicited differentially by certain 
content categories, or in relation with another 
stylistic component. 

5 Messick, S. J., and Jackson, D. N. “Response 
Style and Factorial Interpretation of the MMPI.” In 
preparation. 

The specific source of the variables which ap- 
pear to moderate (50) the operation of response 
acquiescence in the MMPI is obviously a com- 
plicated research problem which awaits more 
evidence for a definitive answer. One very prom- 
ising lead, however, is encountered in another im- 
portant stylistic determinant of test-taking be- 
havior, the general tendency to endorse socially 
desirable or socially undesirable statements about 
oneself. This stylistic response tendency on the 
part of individuals should be distinguished from 
the judged characteristics of desirable and un- 
desirable item content. There is considerable 
evidence that this tendency is general and is re- 
lated to a tendency to respond in an idiosyncratic 
or atypical manner. Edwards (16) has reported 
a correlation of .87 between judged social de- 
sirability scale values and the proportion of re- 
spondents independently endorsing them. Hanley 
(29) obtained correlations of .82 and .89 re- 
spectively between probability of endorsement 
and social desirability ratings for samples of items 
from the MMPI D and Sc scales. Fordyce (17) 
correlated with the MMPI clinical scales a set 
of MMPI items judged to be socially desirable. 
His obtained correlations were high, ranging from 
-.38 to -.91. Although these coefficients indi- 
cate the importance of social desirability in scales 
like the MMPI, they also reflect the influence of 
response acquiescence, since the social desirability 
scale contained a disproportionate number of 
items keyed false. Jackson (33) showed that a 
combination of ranked indices of response ac- 
quiescence and social desirability on scales of the 
California Psychological Inventory was related to 
the rank of each scale’s correlation with the 
MMPI K scale to the extent of r_s = .86. This 
value was higher than the correlation of either 
response style operating singly, suggesting the 
possibility of summative effects of response ac- 
quiescence and social desirability. 

Berg (8, 9), granting that there are modal 
response patterns, suggested that individual dif- 
ferences, particularly deviations, may be reveal- 
ing of personality style. Berg hypothesized that 
deviant behavior tends to be general and not 
specific to any particular content area. Barnes 
(3, 4) appraising the Berg deviation hypothesis 
in the MMPI, shed important light on the rela- 
tion between an acquiescent style and idiosyn- 
cratic responses. Barnes demonstrated a close 
correspondence between Wheeler, Little, and 
Lehner’s (57) first or “psychotic” factor and 
total number of items answered deviantly true, 
and between their second or “neurotic” factor 
and total number of items answered deviantly 
false. Although response acquiescence and the 
tendency to respond in a socially undesirable or 
deviant manner are confounded in Barnes’ anal- 
ysis, these results strongly support the notion 
that items judged low in social desirability evoke 
different tendencies toward acquiescence, as com- 
pared with items judged high in social desirability. 
This interpretation appears consistent with Welsh’s 
(55) data, where the first pure factor scale, com- 
posed of 38 “true” items out of 39, contains many 
socially undesirable statements, while the second 
pure factor scale, where all the items are keyed 
false, seems to consist predominantly of neutral 
or somewhat socially desirable statements. Here 
again, a consistent response style to acquiesce 
seems to be elicited differentially by a variety of 
self-deprecatory statements on the one hand, 
while, alternatively, neutral or mildly socially de- 
sirable statements evoke consistent differential 
tendencies to disagree or to be negativistic. 

Whether there are consistencies attributable to 
content after allowing for style in these first two 
factors or, indeed, in any obtained scores on the 
present form of the MMPI is an important re- 
search question, as is the relation between various 
content and stylistic factors and psychopathology. 
If Berg (8) is correct, if one might just as well 
use abstract drawings (3) as items to discriminate 
empirically psychiatric patients from normals, 
then it may be that content is less important and 
style more important than previously supposed. 
If this is the case, then past attempts to draw 
conclusions about respondents on the basis of 
their answers to uncontrolled item content are 
suspect. If, on the other hand, consistencies in 
content can be demonstrated above and beyond 
components of style, it is extremely important 
that measures of these content variables make 
adequate use of proper experimental controls to 
avoid as far as possible confounding with style. 
Use of recent advances in scaling theory (27, 53) 
might be helpful. 


MEASURING PERSONALITY STYLES 


In approaching the problem of the assessment 
of style, a curious dilemma presents itself. On 
the one hand, it is easy to show that most per- 
sonality tests are loaded with stylistic components, 
but on the other hand, good measuring devices 
for these dimensions do not exist, largely because 
few research workers have attempted explicitly 
to devise such scales. Typically, a single measure, 
like the California F scale, the MMPI K scale, 
or Bass’s (6) collection of aphorisms, has been 
offered as an index of a response style, acquies- 
cence, for example. Little thought is given to 
the fact that these measures may not only contain 
several dimensions of content, but of style as 
well, thus limiting their usefulness as indices of 
any particular style. Thus, Fordyce (17) has 
suggested that the MMPI K scale reflects tend- 
encies to respond in a socially desirable manner, 
while Fricke (19) has argued that the K scale 
reflects acquiescence. Evidence from each of 
the two authors is convincing, and, indeed, Jack- 
son’s study (33) supports the notion that the K 
scale contains both acquiescence and social de- 
sirability variance. It may reflect other things 
as well, but this confounding is not conducive to 
its use as a measure of one particular style. The 
same criticism might be leveled at the California 
F scale, at Edwards’ (16) social desirability scale, 
and at Bass’s (6) social acquiescence scale, all 
of which seem to confound response acquiescence 
with social desirability. 

One way to construct measures of such styles 
as acquiescence or overgeneralization would in- 
volve selection of items extremely heterogeneous 
in content. Experimentally independent meas- 
ures of each style would, of course, be desirable. 
Since a response style to answer in a socially de- 
sirable or undesirable direction seems to be omni- 
present, it is hard to avoid in measures of other 
styles. Rather than attempting to develop items 
all at one level of social desirability, it might be 
better to vary social desirability systematically 
and to observe its relationships and interactions 
with other variables. Helmstadter (31) has de- 
scribed procedures for obtaining separate scores 
for different components of a test, some of which 
would be especially relevant to a situation in 
which one had already obtained social desirabil- 
ity scale values. Although social desirability has 
been assumed to be one-dimensional, it is easy 
to conceive of distinct, but perhaps correlated, 
dimensions consisting of items reflecting irre- 
sponsibility, psychiatric bizarreness, or hostility. 
The selection of sets of items for different di- 
mensions of judged social desirability would be 
facilitated by the application of recent advances 
in multidimensional scaling (44). Such refine- 
ments as separating out the components of social 
desirability would do much to clarify response 
determinants and might put personality evalua- 
tion upon a more rigorous basis than has previ- 
ously been thought possible. 

Although the emphasis in this paper has been 
on some of the more conspicuous stylistic de- 
terminants encountered in common personality 
tests, there are many other possible measures of 
style that might be derived from personality 
theory. For example, a tendency to express a 
liking for diverse things, although it might be 
response acquiescence in a new disguise, might 
also represent greater cognitive differentiation or 
capacity to invest energy freely in objects in one’s 
environment. Such general expressions of “like” 
and “dislike” have been found to be reliable. 
On one set of 300 items dealing with diverse 
activities (51), the corrected split-half reliability 
of the tendency to respond “like” was .86. With 
a paucity of evidence on these issues, the alterna- 
tive to such conjecture is carefully planned re- 
search, for which there is an obvious need. There 
are many other research opportunities for the 
measurement of style, such as asking respondents 
to select from among two or more personality, 
attitude, or achievement items, equal in valence 
or correctness, but couched in different phrasings 
—perhaps one elaborate and pedantic, one simple, 
and one containing slang. Preferred modes or 
styles of expression might also be readily evalu- 
ated by techniques disguised as achievement tests 
(32). In this context, it would be interesting to 
evaluate personality correlates of such attributes 
as tolerance for logical contradictions within a 
passage, of a tendency to gamble on achievement 
tests (28, 54), and a variety of other consistent 
modes of response. Similarly, further research 
is needed to evaluate Jackson’s (34, 35) hypoth- 
esis that respondents who acquiesce consistently 
manifest a lower level of cognitive energy in 
other situations. 
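
The "corrected split-half reliability" of .86 cited above for the tendency to respond "like" presumably refers to the usual Spearman-Brown step-up of a half-test correlation. The following is a minimal sketch of that computation, assuming an odd-even split of the items and dichotomous scoring (1 = "like", 0 = "dislike"). The function names and the choice of split are illustrative assumptions, not the procedure reported in (57).

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_brown(r_half):
    """Step a half-test correlation up to the reliability of the full test."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(responses):
    """Corrected split-half reliability of the number of "like" responses.

    responses: one list per respondent, 1 = "like", 0 = "dislike".
    The odd-even split is an assumption made for this sketch.
    """
    odd = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    return spearman_brown(pearson_r(odd, even))
```

Under this formula a half-test correlation of about .75 steps up to roughly .86; the uncorrected value itself is not reported in the source.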


SUMMARY 
It has been suggested that stylistic [...]
on some personality scales, particularly the California
F scale, the MMPI, and the California
Psychological Inventory. In [...] of style it is important to
select not only those measures which have appeared
by accident on already established tests,
but to design assessment techniques explicitly to 
evoke theoretically important styles of response. 
Research involving response style may contribute 
to a more systematic measurement in personality 
and may pay off handsomely in helping to further 
the common ground between personality theory 
and personality assessment. 


EMPIRICAL FINDINGS AND
THEORETICAL PROBLEMS IN 
THE USE OF ANXIETY SCALES * 


IRWIN G. SARASON 1


* Reprinted by permission from the Psychological
Bulletin, September, 1960, Vol. 57, No. 5, 403-415.

1 The preparation of this paper was facilitated by a
grant (M-2397) from the National Institute of Mental
Health.

In terms of productivity during the past
decade, few areas of study in psychology have
matched the output of research on scales of
anxiety. While the inundation of papers on
anxiety has impressed some workers and troubled
others, it behooves us to inquire into (a)
the stimulus value of anxiety scales for psychologists,
(b) the contribution of research on anxiety
to the body of psychological knowledge, and (c) 
the problems for future study raised by this re- 
search. It is the purpose of this paper to attempt 
such an evaluation with emphasis on the relation- 
ship of anxiety to stress, learning, intelligence, 
physiological responses, other personality char- 
acteristics, and test-taking attitudes. The pur- 
port of the paper is not to present a general 
review of all studies on anxiety in these areas 
but rather to attempt to abstract from a large 
literature major trends which seem of present 
or potential significance. 

An attempt such as the present one seems par- 
ticularly appropriate in view of several recent 
evaluations of research involving anxiety scales 
which noted the unreplicability and inconsisten- 
cies of certain reported findings in this field (5, 8, 
34, 54, 72). Frustrating as this state of affairs 
may be, the present writer will attempt to show 
that unreplicability is not necessarily attributable 
to unreliability in the anxiety measuring instru- 
ments, but rather, to several “traditional” vari- 
ables such as characteristics of Ss and Es, and 
population and instructional variables which con- 
found with anxiety measures. 


THE STIMULUS VALUE FOR THE PSYCHOLOGIST 
OF ANXIETY SCALES 


In view of the centrality of the concept of 
anxiety in personality theory, it is somewhat sur- 
prising that attempts to measure the concept ob- 
jectively have developed only in recent years. 
Also, psychologists concerned with personality 
functioning might well be surprised at the con- 
text in which the first widely used anxiety scale 
was developed. A group of experimental psy- 
chologists interested in problems of learning was 
responsible for the development of Taylor’s
Manifest Anxiety Scale (MAS) (32, 113, 114, 
116). The main interest of these researchers in 
the MAS was in the measurement of Hull’s D 
in human Ss who were being studied in learning 
situations. 

Whereas the work stemming from the Iowa 
laboratory was concerned with the relationship of 
MAS to D, other researchers have inquired into 
the relationship between anxiety measures and a 
host of varied behaviors and situations (29, 31, 
36, 52, 79, 85, 103, 112, 124, 127). Motivated
by the need for measures of personality relevant
to such variables as intellectual performance, re- 
action to stress, and ability to learn, psychologists 
seized upon the objective, easily administered 
MAS. In view of the absence of measures of 
individual differences in anxiety, the motivation 
underlying the swift adoption of the MAS seems
clear. However, the criticism of Jenkins and 
Lykken (53) that in some research projects in- 
volving the MAS the rationale for its use has 
been lacking seems to be a just one. 

The availability of the MAS served to stimulate 
its use by researchers with varied interests, and 
it has also encouraged other investigators to con- 
struct other measures of anxiety better fitted to 
their specific needs (3, 26, 70, 74, 94, 122, 123). 
As a result, measures for specific anxieties such 
as test anxiety, social anxiety, and anxiety in 
children are now readily available. There is 
reason to believe that the various measures of 
anxiety in current use are not all measuring the 
same thing (35, 39, 41, 51, 67, 95, 108, 125, 
128). An important current problem is the 
clarification of the similarities and differences 
among existing anxiety indices. In this connec- 
tion, Jessor and Hammond (55), working within 
the framework of Cronbach and Meehl’s (79) 
concept of construct validity, have provided some 
very useful suggestions. It is to be hoped that 
the days of naive acceptance of the face validity 
of anxiety scales may be numbered and that more 
concern will be given to the theoretical bases 
underlying the use of particular measures of 
anxiety. 


RELATIONSHIP OF ANXIETY TO BEHAVIOR 


As has already been indicated, existing studies 
of anxiety literally defy summary as a unit. How- 
ever, it is possible to discern trends and problems 
in certain areas where anxiety scales have been 
employed, and it is with these that this paper will 
be concerned. 


ANXIETY AND STRESS 


Many investigators have studied the reactions 
of Ss differing in scores on anxiety scales to situa- 
tions posing personal threat or stress for Ss. 
Typically, the stress has been created by means 
of verbal instructions, e.g., informing S he is 
about to take an intelligence test. Most investi- 
gators have assumed that high anxious Ss would 
be more sensitive to implied personal threat than
would low anxious Ss. 

Although some investigators (18, 34, 45, 117) 
have presented evidence not consistent with this 
assumption, the bulk of the available findings 
suggest that high anxious Ss are affected more 
detrimentally by motivating conditions or failure 
reports than are Ss lower in the anxiety score 
distribution (24, 42, 64, 69, 74, 82, 88, 90, 91, 
96, 97, 98, 101, 120, 124). Illustrative of this 
type of study is that of Davidson, Andrews, and 
Ross (24) in which three variables were studied: 
(a) MAS scores, (b) reports to Ss of levels of 
failure, and (c) speed of presentation of task 
stimuli. Significant interactions were obtained 
among all of the variables, and the authors con- 
cluded that high anxious Ss are more sensitive to 
experimental stress than are low anxious Ss. 

In this connection it is interesting to note that 
high anxious Ss have been found to be more 
self-deprecatory, more self-preoccupied, and gen- 
erally less content with themselves than Ss lower 
in the distribution of anxiety scores (4, 16, 27, 
36, 49, 50, 119, 124, 127). It may well be that 
highly motivating or ego-involving instructions 
serve the function of arousing these self-oriented 
tendencies. One recent study (93) has shown 
that Ss scoring high in test anxiety respond more 
positively to reassurance in an experimental situa- 
tion than do low anxious Ss. A worthwhile 
problem for future research would seem to be 
the development of techniques for the extinction 
rather than the arousal of anxiety responses. 

Consistent with the interpretation of anxiety 
measures as indicators of sensitivity to implied 
personal threat is the finding by several investi- 
gators that there are no differences among groups 
differing in scores on anxiety scales when tested 
under neutral and apparently nonthreatening con- 
ditions (77, 88, 90, 91, 107). Sarason, in a 
series of three experiments (88, 90, 91) involv- 
ing the effects of anxiety and experimental stress 
on verbal learning, failed to find under pre- 
experimental neutral conditions significant dif- 
ferences in performance between groups which 
differed in anxiety, although varying perform- 
ance was obtained under later conditions of per- 
sonal threat. This suggests a sensitivity inter- 
pretation of anxiety similar to the one offered by 
Davidson, Andrews, and Ross (24). Furthermore,
evidence recently reported suggests the possibility
that the more directly related the content
of items on the anxiety scale is to the situation 
in which Ss are to perform, the more useful is 
the measure of anxiety in showing interactions 
between scores on the scale and differential moti- 
vating instructions (84, 93, 95, 97, 98). 

The results of studies on anxiety and stress 
have led to what might be called a habit inter- 
pretation of anxiety (14, 24, 80, 82, 87, 94, 95). 
This interpretation, briefly put, states that Ss 
scoring high and low in anxiety differ in the 
response tendencies activated by personally 
threatening conditions. Whereas low scoring Ss 
may react to such conditions with increased effort 
and attention to the task at hand, high scoring 
Ss respond to threat with self-oriented, person- 
alized responses. More information is needed 
to clarify the conditions, such as those in the 
family and schoolroom environments, which are 
associated with the development of heightened 
responsiveness to stress. The rapidly burgeoning 
interest in the measurement of anxiety in children 
can be helpful in this regard (13, 71, 99, 121). 

A neglected problem in the creation of experi- 
mental stress situations is that of the E as an 
agent in creating a threat to S. Even when quite 
explicit motivating instructions are administered 
to S, there remains the problem of the administration
of these instructions. The problem of
variance among Es in the manner with which 
instructions are communicated cannot be over- 
emphasized. Systematic study is needed of the 
relationship between E variables such as sex and 
personality of E and anxiety aroused. 


ANXIETY AND TASK VARIABLES 


As has already been mentioned, the originators 
of the MAS considered it to be a measure of 
drive, D, and were primarily interested in relat- 
ing it to the concept of the response hierarchy. 
In simple, one-response situations such as eyelid 
conditioning, it was predicted that high anxious 
Ss would perform at higher levels than would 
low anxious Ss. However, as the complexity 
(e.g., intralist similarity) of the task to be learned 
increased, a superiority of low to high anxious 
Ss was expected. 
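
The prediction just described follows from the multiplicative relation between drive and habit strength in Hullian theory. A deliberately simplified sketch is given below: it assumes excitatory potential E = D x H for each response tendency and ignores thresholds, inhibition, and oscillation. The drive levels and habit strengths are arbitrary illustrative numbers, not values from any study cited here.

```python
def excitatory_potential(drive, habit):
    """Simplified Hullian rule: excitatory potential E = D * H."""
    return drive * habit


def correct_response_margin(drive, habit_correct, habit_competing):
    """E of the correct response minus E of its strongest competitor."""
    return (excitatory_potential(drive, habit_correct)
            - excitatory_potential(drive, habit_competing))


LOW_D, HIGH_D = 1.0, 2.0  # arbitrary drive levels for low and high anxious Ss

# Simple task: the correct response dominates the response hierarchy.
simple_low = correct_response_margin(LOW_D, 0.6, 0.1)
simple_high = correct_response_margin(HIGH_D, 0.6, 0.1)

# Complex task: a competing (incorrect) response is initially stronger.
complex_low = correct_response_margin(LOW_D, 0.3, 0.5)
complex_high = correct_response_margin(HIGH_D, 0.3, 0.5)

print(simple_low, simple_high)    # margin favoring the correct response grows with D
print(complex_low, complex_high)  # margin favoring the wrong response grows with D
```

Because drive multiplies every habit strength, higher drive enlarges whatever difference already exists in the hierarchy, which helps when the correct response is dominant and hurts when it is not.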

A number of the studies conducted within this 
framework have supported these assumptions 
(33, 81, 83, 109, 113, 118). For example, 
Montague (81) compared high and low anxiety 
groups in ability to learn lists of nonsense syllables
which differed in association value and intralist 
similarity. A significant interaction was ob- 
tained with low anxious Ss superior to high 
anxious Ss on the most complex or difficult task. 
On the least complex task, high were superior to 
low anxious Ss. These findings, although sub- 
ject to alternative interpretations, are in accord 
with Hullian expectations. Farber (32) and 
Taylor (116) have presented summaries and 
analyses of work on anxiety from a drive point 
of view. 

Despite these positive findings, a review of the 
literature also reveals a number of other studies 
either contradictory to or not consistent with a 
drive interpretation of anxiety (1, 7, 25, 46, 58, 
59, 86, 107). Several of these studies were 
specifically designed to test predictions from 
Hullian theory concerning the performance on 
either simple or complex tasks of groups scoring 
high and low on anxiety scales. An especially 
interesting experiment was performed by Bindra, 
Paterson, and Strzelecki (7). They did not ob- 
tain significant differences between high and low 
anxious Ss in simple conditioning. It is interesting
that their situation involved a nondefensive
response rather than the defensive one used in
many drive studies. The threatening aspects of 
receiving puffs of air in the region of the eye 
may be much more crucial in affecting perform- 
ance than the lack of response hierarchy com- 
petition hypothesized for such one-response sit- 
uations (48, 56). It seems likely that for cer- 
tain tasks there exists a confounding of task 
simplicity with task stressfulness. Korchin and 
Levine (64) have actually interpreted complexity 
of the learning situation not so much as a task 
variable but as a stress variable. Kausler and 
Trapp (60) have recently presented a critique 
of the drive interpretation of anxiety which dis- 
cusses other problems along these lines. 

One particular problem suggested by studies 
of anxiety in which complex tasks are used is 
this dual aspect of task complexity. A complex 
task can be difficult and at least potentially 
threatening to S. Under what conditions either 
or both of these aspects of task complexity are 
operative has as yet not been systematically 
studied. Certainly a closer tie-in between studies 
of anxiety and stress and studies of anxiety and 
task factors seems indicated. 


In an attempt in this direction, Sarason and 
Palola (98) manipulated simultaneously the vari- 
ables of anxiety, differential motivating instruc- 
tions, and task complexity in three experiments. 
Significant triple interactions involving the three 
variables studied were obtained in every case. 
These results are in accord with the dual proper- 
ties of task complexity already mentioned. They 
seem to suggest the necessity of developing an 
integrated interpretation of anxiety in terms of 
the experimental conditions most detrimental to 
the performance of high anxious Ss. For exam- 
ple, the combined use of high threat and high 
complexity of task might lead to larger differ- 
ences in performance between high and low 
anxious Ss than the manipulation of either threat 
or complexity alone. A study by Taylor (117)
illustrates the need for this type of research, and 
Nicholson (82) has recently presented findings 
consistent with this formulation. 
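
The notion of a triple interaction among anxiety, instructions, and task complexity can be made concrete as a difference of differences of differences among cell means. The sketch below assumes a 2 x 2 x 2 layout; the cell means are invented solely for illustration and are not data from Sarason and Palola (98) or any other study cited here.

```python
def two_way_interaction(cells):
    """Difference of differences in a 2 x 2 table of cell means.

    cells[a][t]: a = anxiety (0 low, 1 high), t = instructions
    (0 neutral, 1 threat). Returns the threat effect for high anxious Ss
    minus the threat effect for low anxious Ss.
    """
    return (cells[1][1] - cells[1][0]) - (cells[0][1] - cells[0][0])


def triple_interaction(easy, hard):
    """Change in the anxiety x threat interaction from easy to hard tasks."""
    return two_way_interaction(hard) - two_way_interaction(easy)


# Hypothetical mean performance scores (invented for this sketch).
easy_task = [[10, 11],   # low anxious: neutral, threat
             [10, 9]]    # high anxious: neutral, threat
hard_task = [[8, 9],
             [8, 3]]

print(two_way_interaction(easy_task))           # -2
print(two_way_interaction(hard_task))           # -6
print(triple_interaction(easy_task, hard_task))  # -4
```

In these invented numbers threat hurts high anxious Ss more than low anxious Ss on both tasks, and the gap widens on the hard task; a nonzero third-order contrast of this kind is what a significance test of the triple interaction evaluates.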

In addition to needed advances in theory in 
integrating the anxiety, motivational, and task 
variables, it is imperative that theories of anxiety 
also incorporate such variables as the sex of S
and E. This was suggested by Kamin and Clark 
(58) and has been most dramatically illuminated 
by the results of a group of researchers at the 
University of Rochester (1, 46). These workers 
have consistently shown significant interactions 
between (a) anxiety scores, (b) sex of S, and 
(c) E characteristics. These latter two variables 
related more powerfully to anxiety of Ss than did 
task complexity, the primary focus of their re- 
search. As these authors point out, psycho- 
logical theory has failed to deal systematically 
with the S and E variables. 


ANXIETY AND INTELLIGENCE 


Although several investigators have reported 
negative relationships between MAS scores and 
intellectual performance for certain S populations 
(43, 62, 77, 104, 105, 110), the majority of 
studies relating measures of general anxiety to 
measures of intellectual performance have yielded 
nonsignificant correlations (20, 23, 40, 51, 63, 
78, 89, 95, 102, 115). 

Whether one should infer that high anxious 
Ss are less bright than other Ss when significant 
negative correlations between anxiety and in- 
tellectual performance are obtained depends on 
the interpretation placed on anxiety scales. The 
finding that under stressful conditions low anxious
Ss perform at higher levels than high anxious Ss, 
and under nonstressful conditions high and low 
anxious Ss perform equally well, might suggest 
that labeling a test as an intelligence test and the 
difficulty of the test itself may arouse anxiety 
responses in high anxious Ss which interfere with 
their performance. Motivational and situational 
variables associated with testing have not yet 
been manipulated systematically in studies which 
attempt to relate anxiety and intelligence. 

As was indicated earlier it would appear that, 
for college students, tests of the ACE type are 
unrelated to, or only very slightly related to, 
measures of general anxiety such as MAS. How- 
ever, studies which have related test anxiety, i.e., 
anxiety experienced in test situations, to measures 
of intellectual performance have shown consistent 
negative correlations. The Ss scoring high in 
test anxiety obtain lower performance scores than 
Ss with lower scores (17, 73, 92, 95, 100). In 
one study (95) in which both general and test 
anxiety indices were used, it was found that test 
anxiety correlated negatively with several intel- 
lectual measures for both male and female college 
students, but measures of general anxiety and 
other personality variables were unrelated to in- 
telligence. 

An important problem in the study of the cor- 
relation between anxiety and intelligence which 
has not been given enough emphasis is that of the 
range of intellectual ability studied. If restricted 
ranges of ability are used, it will make it less 
likely that significant correlations will emerge. 
Although investigators in this field are aware of 
the limited inferences one can draw on the basis 
of restricted sampling (e.g., college students, air 
force recruits, student nurses), no systematic 
attempts have been made to study the relation- 
ships between anxiety and intelligence in differ- 
ent populations using similar measures of anxiety 
and intelligence in all comparisons. Spielberger 
(110) and Calvin, Koons, Bingham, and Fink 
(11) have presented evidence which strongly 
suggests the need for such a systematic considera- 
tion of sampling variations. 
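
The effect of a restricted range of ability on the anxiety-intelligence correlation can be illustrated with a small simulation. The sketch below is hypothetical throughout: it assumes bivariate normal scores with a population correlation of -.30 and treats a "college-like" sample as the top 20 per cent of the ability distribution. None of these values comes from the studies cited.

```python
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate(n=5000, rho=-0.3, keep_top=0.2, seed=0):
    """Return (r in the full sample, r in the ability-restricted sample)."""
    rng = random.Random(seed)
    ability, anxiety = [], []
    for _ in range(n):
        a = rng.gauss(0, 1)
        e = rng.gauss(0, 1)
        ability.append(a)
        # Anxiety correlates rho with ability in the unrestricted population.
        anxiety.append(rho * a + (1 - rho ** 2) ** 0.5 * e)
    r_full = pearson_r(ability, anxiety)
    # Keep only the highest-ability respondents (assumed selection rule).
    cutoff = sorted(ability)[int(n * (1 - keep_top))]
    kept = [(a, x) for a, x in zip(ability, anxiety) if a >= cutoff]
    r_restricted = pearson_r([a for a, _ in kept], [x for _, x in kept])
    return r_full, r_restricted


r_full, r_restricted = simulate()
print(round(r_full, 2), round(r_restricted, 2))
```

Under these assumptions the correlation in the selected group is noticeably closer to zero than in the full sample, even though the underlying relationship is the same for everyone.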


ANXIETY AND PHYSIOLOGICAL VARIABLES 


As anxiety is defined clinically, it is typically 
assumed that it has important physiological correlates.
On the basis of assumptions of this type,
several investigators have sought relationships between
anxiety and a variety of physiological
measures (e.g., GSR). Although work in this 
area seems only to be getting under way, the 
results to date have been largely negative. Meas- 
ures of questionnaire-defined anxiety such as 
MAS do not seem to relate consistently to physio- 
logical responding (2, 6, 12, 68, 84). Although 
these negative findings can be taken as reflecting 
poorly on the validity of MAS-type scales, it may 
also be that these scales are tapping aspects of 
anxiety other than autonomic functioning. It is 
known that there are marked individual differ- 
ences among Ss in their physiological response 
patterns under stress conditions (65, 66). Conse- 
quently, in research relating anxiety and auto- 
nomic response, it would seem desirable to study 
patterns of physiological responding rather than 
only one physiological response measure. 

Another important variable as yet unstudied 
in this area relates to the conditions under which 
Ss’ physiological responses are measured (75). 
The situational and experimental conditions under 
which an hypothesized relationship should be 
present or not present have not been explored. 
It is known that even patients diagnosed as anx- 
iety states do not display anxiety symptoms at 
all times and do not always show the same pat- 
terns of symptoms. Just as the habit interpreta- 
tion of anxiety would expect that, if one wished 
to maximize differences between high and low 
anxious Ss on intelligence tests, Ss would have 
to be run using highly motivating, ego-involving 
conditions, so also physiological differences be- 
tween Ss differing in anxiety might occur only 
under stressful or motivating conditions. 


MEASUREMENT OF ANXIETY AND ITS RELATION TO 
PERSONALITY AND TEST-TAKING ATTITUDES 


In constructing tests of personality we must 
ask ourselves many questions related to their 
reliability and validity. What does the test pur- 
port to measure? What is the best format for 
the test? How does the test relate to other avail- 
able instruments? What are the best ways in 
which to establish the validity of the test? 

Jessor and Hammond (55) have pointed out in 
relation to anxiety scales that some of these ques- 
tions can ultimately be answered through the 
process of construct validation. Unfortunately, 
at present, the construct validation of anxiety 
scales is at a rudimentary stage. For example,
is a true-false paper and pencil test the most
appropriate measure of anxiety? At present we
do not know the answer to this question. Prob- 
ably the major reason for the wide use of paper 
and pencil indices of anxiety is convenience. 
While convenience is a desirable characteristic, 
research is needed to investigate less convenient 
but perhaps more useful indices. 

Perhaps the most parsimonious statement that 
one can make concerning what is measured by 
existing scales of anxiety is that they measure the 
extent to which an individual is willing to admit 
to experiencing anxiety in certain situations. 
However, also to be considered are the following 
possibilities: (a) high anxiety scores may be 
obtained by certain Ss because of plus-getting 
tendencies, i.e., tendencies to attribute “bad” 
characteristics to themselves; (b) high scores may 
be obtained by particularly frank and open Ss; 
(c) high scores may be obtained by Ss who are 
particularly perceptive of their own reactions. 
The converse of each of these possibilities repre- 
sents a possible basis for low anxiety scores. 

In this connection, it should be pointed out 
that many true-false scales of anxiety have been
found to correlate very highly and negatively 
with measures of defensiveness, test-taking atti- 
tude, and the tendency to respond to personality 
test items in a socially desirable direction (28, 
38). Such high correlations may indicate that 
anxiety scores are explainable in terms of test- 
taking attitude. Whether or not this is true is a 
problem that construct validation studies should 
be designed to answer. 

Interestingly, it is possible to construct scales 
of anxiety which do not correlate very highly 
with measures of test-taking attitude. The writer 
(95) has obtained correlations between the Test 
Anxiety Scale and SD of -.49 for women and
-.23 for men. Also several forced-choice anxiety
scales have been presented which to a very con- 
siderable extent seem to reduce the correlations 
between anxiety scores and measures of test- 
taking attitude (15, 47, 70, 106). However, it 
is possible that forced-choice techniques in the 
field of personality measurement create as many 
problems as they solve (44, pp. 188-189). More 
research designed to measure anxiety in a variety 
of ways and to better understand anxiety and 
test-taking attitude relationships seems indicated. 


Two additional areas which require further 
study are the relationship of measures of anxiety 
to (a) other personality dimensions and (b) to 
the clinical conditions of patients. With respect 
to anxiety and other personality measures, it ap- 
pears that at least one test, the Psychasthenia, 
Pt, scale of the MMPI, correlates as highly with 
the MAS as the MAS correlates with itself (9, 
25, 30). Although Pt-MAS item overlap is
clearly a factor in these high correlations, this
relationship between MAS and Pt may suggest
that high scorers on anxiety scales obtain such 
scores because of ruminative, obsessive thinking 
about themselves. If scales of anxiety, or at 
least the MAS, are measuring a variable related 
to obsessive-compulsive tendencies, then the posi- 
tive correlations reported by some investigators 
(22, 103) between MAS and measures of author- 
itarianism perhaps might be explained in terms 
of the dogmatism and rigidity often observed in
neurotic obsessive-compulsive personalities. 
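
The statement that Pt correlates with the MAS as highly as the MAS correlates with itself can be read in terms of the classical correction for attenuation, under which an observed correlation is divided by the square root of the product of the two reliabilities. The sketch below uses invented values (.80 throughout) purely to show the arithmetic; the actual coefficients are given in the studies cited (9, 25, 30), not here.

```python
def disattenuated_r(r_xy, r_xx, r_yy):
    """Classical correction for attenuation: r_xy / sqrt(r_xx * r_yy)."""
    return r_xy / (r_xx * r_yy) ** 0.5


# Hypothetical values: the MAS-Pt correlation equals the reliability of each scale.
estimate = disattenuated_r(r_xy=0.80, r_xx=0.80, r_yy=0.80)
print(round(estimate, 2))
```

When the cross-scale correlation is as large as the reliabilities, the corrected estimate approaches unity, which is one way of expressing the concern that the two scales may be measuring much the same variable. Shared items inflate the observed correlation, so the figure should not be taken at face value.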

As was mentioned earlier, the weight of the 
available evidence indicates that scales of anxiety 
are tapping tendencies towards neuroticism, mal- 
adjustment, and self-dissatisfaction (4, 16, 23, 
36, 50, 126). There are indications also that 
this heightened insecurity of high anxious indi- 
viduals may result in a greater susceptibility to 
persuasion and opinion change, and to greater 
sensitivity to reinforcements provided by E to S 
in learning situations (37, 52, 94, 111). For 
example, in two similar verbal conditioning 
studies both Taffel (111) and Sarason (94) found
that high anxious neuropsychiatric patients 
changed their frequency of usage of a verbal 
response class reinforced by E more easily than 
did patients with lower anxiety scores. 

These sorts of relationships are consistent with 
the observations made in psychotherapeutic con- 
tacts with patients that likelihood of movement 
in therapy is, to a considerable extent, positively 
related to the patient’s anxiety level. However, 
results of studies on the diagnostic value of in- 
dices of anxiety do not as yet fall into clearly 
discernible patterns, and it is hard to draw gen- 
eralizations concerning the value of these indices 
as diagnostic tools. It can be said that a number 
of investigators have found anxiety scales to be 
correlated either with indices of general malad- 
justment or ratings of anxiety made by clinicians 
(10, 49, 67, 76, 112). The magnitude of these 
correlations, while significant, has often been so
low as to preclude use in the individual case.
Kendall (61) has suggested that MAS be re- 
garded as only a rough clinical tool. 

An important methodological problem in re- 
lating anxiety indices to the ratings of patients’ 
anxiety made by clinicians is the method of ob- 
taining such judgments. Poorly constructed 
rating scales will inevitably lead to low-order 
relationships with other measures. In this regard, 
attention should be called to the interesting 
study by Buss et al. (10) in which the use of 
adequate procedures to insure interrater reliability 
among clinicians no doubt contributed to the 
positive results obtained. 


SUMMARY 


This paper has dealt with the relationship of 
anxiety to certain research areas. Existing research
suggests the following summaries:

1. The performance of high anxious Ss is 
detrimentally affected by verbally administered 
highly motivating communications. This result 
is consistent with the view that high anxious Ss 
emit personalized, self-oriented interfering re- 
sponses when threat is perceived in the environ- 
ment. Under nonthreat conditions the emission 
of such responses would not be expected. It was 
pointed out that several methodological problems 
remain to be solved in the assessment of the rela- 
tionship between anxiety and stress. On the E 
side there is the confounding of variables such 
as experimental instructions with characteristics 
of the E administering such instructions. On 
the S side, more must be learned about the rela- 
tionship of sex and personality characteristics of 
Ss which affect their responses to conditions of 
implied threat. 

2. The results of several experiments using 
MAS as a measure of drive have indicated that, 
as task complexity increases, the disadvantage of
high relative to low anxious Ss appears to increase. However,
there has been considerable research in
which this relationship was not confirmed. Per- 
haps the major theoretical problem in the anxiety- 
task complexity relationship is the interpretation 
to be placed on the complexity variable. Com- 
plex tasks can be both difficult and emotionally 
arousing. It would appear that both of these 
aspects of task complexity must be considered. 

3. Although several reports of correlations between
measures of general anxiety, such as the
MAS, and intellectual measures are to be found 
in the literature, it does not appear that this rela- 
tionship consistently holds. Specific test anxiety, 
on the other hand, does seem to relate negatively 
to intellectual measures. It has been suggested 
that indices of specific anxieties such as test 
anxiety may prove more valuable for specific pur- 
poses than more general indices like MAS. 

4. Negative findings seem to pervade the study 
of the relationship of anxiety to physiological 
indices. The typical procedure has been to select 
Ss differing in anxiety scores and to compare these 
Ss on autonomic measures such as GSR. It was 
suggested that the lack of significant relationships 
in such comparisons may be attributable to a 
failure to make the comparisons under conditions 
of perceived threat or stress. High and low 
anxious Ss may differ in physiological response 
under threat but not under nonthreat conditions. 

5. Problems of the effects of test-taking atti- 
tudes on anxiety scores and the format of anxiety 
scales have as yet not been given the intensive 
study which they merit. While most indices of 
anxiety of the MAS type have been found to cor- 
relate negatively and very highly with measures 
of test-taking attitudes (e.g., the K scale of the 
MMPI), this has not been obtained in all cases. 
Forced-choice techniques and the Test Anxiety 
Scale do not correlate as highly with test-taking 
attitude as do MAS and other general anxiety 
indices. Further construct validation of both 
anxiety and test-taking attitude scales may il- 
luminate the significance of these findings. 

The aim of this paper has been to point to 
some of the consistencies and inconsistencies in 
the area of anxiety research and to suggest some 
of the uncontrolled and confounding variables 
which may have led to discrepant findings and 
which need to be systematically studied in future 
research. 


ANTI-SEMITISM AND THE
DISPLACEMENT OF 
AGGRESSION * 


LEONARD BERKOWITZ 1


As is well known, the psychoanalytic principle 
of displacement maintains that aggressive tend- 
encies denied expression against the object in- 
stigating the aggression tend to be directed against 
noninstigating objects. Long part of the social 
science folklore, this principle is given empirical
support in three studies reported by Dollard, 
Doob, Miller, Mowrer, and Sears (4, pp. 42-44) 
in their now classic work, Frustration and
Aggression.

In all of these experiments the frustrated or- 
ganisms displayed increased unfriendliness toward 
objects that happened to be available. However, 
negative findings obtained in more recent research 
(16) indicate that the choice of a target for dis- 
placed aggression is not necessarily solely a func- 
tion of availability. As Miller (13) has shown, 
an object's likelihood of receiving displaced
aggression depends upon such stimulus factors as
its similarity to the instigator and whether the 
direct overt hostility against the instigator is pre- 
vented by the instigator’s absence or by internal 
conflict. In the former case, according to Miller's 
analysis, displaced responses occur to other sim- 
ilar objects, and the strongest aggressive act oc- 
curs to the most similar object present. If there 
is conflict, however, on the assumption that the 
gradient of generalization of the interfering re-
sponses is steeper than that of the aggressive re-
sponses they inhibit, the strongest hostile response
is predicted to occur to stimulus objects that have 
an intermediate degree of similarity to the orig- 
inal instigating object. Supporting evidence is 
found in two recent correlational investigations 
(14, 17) employing fantasy productions. 

* Reprinted with minor abridgement by permission
from The Journal of Abnormal and Social Psychology,
September, 1959, Vol. 59, No. 2, 182-187.

1 This study was supported by Research Grant
M1540 from the National Institute of Mental Health,
U. S. Public Health Service. Douglas Holmes ren-
dered invaluable assistance in collecting the present
data.


The objective stimulus situation may not be
the only determinant of the tendency to displace
aggression, however. Many students of inter- 
group relations have suggested that variations in 
this tendency also are associated with the per- 
sonality characteristics of the angered individuals. 
Thus, according to the “scapegoat” theory of 
prejudice, highly prejudiced individuals are more 
likely than those who are less prejudiced to re- 
duce their hostility by displacing it upon minority 
groups.2 But although this contention has had
a considerable vogue among social scientists (3, 
6), there have been few attempts to test the 
validity of this theory experimentally, and at 
least one test (11) has yielded negative results.
Nor have there been many attempts to spell out 
explicitly the processes accounting for the scape- 
goat reaction. For example, if the prejudiced 
person’s hostility toward minority groups is a case 
of displaced aggression, is this due solely to a 
high level of aggressive drive or to other factors 
unique to his type of personality, such as his 
learning to regard others (particularly minority 
group members) as just as frustrating as the 
present instigator of his hostility? Fromm (8) 
and Maslow (12) are in agreement in character-
izing the authoritarian personality, who, of course,
is likely to be highly prejudiced, as possessing a
relatively high level of hostility, while Siegel (15)
has found that the measure of this personality 
type, the F scale, is positively correlated with an 
index of manifest hostility. If the former alterna- 
tive were correct, then, there should be a greater 
incidence of displacement in experimentally 
angered than in less angered subjects (Ss), re- 
gardless of whether these Ss are characteristically 
prejudiced or not, as long as their level of aggres-
sion is the same. The studies by Miller and
Bugelski (4, pp. 42-44) support this possibility. 
However, if other individual characteristics affect 
the displacement tendency (as is suggested by 
the failures to replicate the Miller-Bugelski re- 
sults) and if these are correlated with persistent 
intolerance toward minority groups, prejudiced 
Ss should be more likely to respond to a frus-
tration with displaced aggression than less preju-
diced individuals, even though both groups are
equally angry.


2 It is not clear from most descriptions of the role
of displaced aggression in social prejudice as to
whether the displacement is a general reaction or is
elicited only by objects having certain stimulus char-
acteristics, e.g., Negroes or Jews. At least one paper
in this area (4) suggests that the former is a likely
possibility.

The present study was designed to test these 
alternatives. Ss differing in their level of anti- 
Semitism were either angered or not by E and 
then assigned to work with a partner on a neutral 
task. To the extent that the aroused anti-Semitic 
Ss exhibit more hostility toward their partners 
than the similarly angered less prejudiced Ss, 
reliable individual characteristics, as well as the 
objective nature of the stimulus objects, will have 
been shown to affect the displacement of aggres- 
sion. 


METHOD 


Subjects—The Ss were 48 female volunteers 
from introductory psychology classes at the Uni- 
versity of Wisconsin distributed evenly among 
the eight conditions in a 2 X 2 X 2 factorial de- 
sign. The girls were classified as either high or 
low in anti-Semitism and, within each of these 
groups, either high or low in aggressive drive, on 
the basis of their responses to a questionnaire 
administered about a month before the experi- 
ment. The former classification made use of 
the Ss’ scores on 17 Likert-type items compris- 
ing a modified version of the anti-Semitism scale 
developed by the California investigators of the 
authoritarian personality (1) with the high and
low anti-Semitic Ss (High A-S and Low A-S) 
taken from the upper and lower thirds of the 
total distribution of scores on this scale. The 
aggressive drive classification was an attempt to 
control for the level of hostility within the Ss, 
employing their scores on Siegel’s Manifest Hos-
tility Scale (MHS) 3 (15), much as the Taylor
Manifest Anxiety Scale has been used in con- 
trolling the level of anxiety drive. However, as 
we would expect on the basis of Siegel’s results, 
it was found that anti-Semitism was significantly 
associated with manifest hostility level. There- 
fore, while both the high and low A-S Ss were 
dichotomized into high and low MHS Ss, the 
high A-S Ss as a group are higher in manifest 
hostility than the low A-S Ss (p < .05). Table 1
presents the mean MHS score in each of the 
experimental conditions. 
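
The layout of the design described above can be sketched in a few lines of Python. This is only an illustration: the factor names and labels are assumptions chosen to mirror the text (anti-Semitism level, MHS level, and experimental treatment), and the even split of the 48 Ss follows from the statement that they were distributed evenly among the eight conditions.

```python
from itertools import product

# Hypothetical labels for the three two-level factors named in the text.
factors = {
    "anti_semitism": ("High A-S", "Low A-S"),
    "mhs_level": ("High MHS", "Low MHS"),
    "treatment": ("Hostility Arousal", "Non-Arousal"),
}

total_subjects = 48

# Every combination of one level per factor is one cell of the design.
cells = list(product(*factors.values()))

# Ss were distributed evenly, so each cell receives the same number.
per_cell = total_subjects // len(cells)

for cell in cells:
    print(" / ".join(cell), "-> n =", per_cell)
```

The count per cell that this produces agrees with the note to Table 1 that N = 6 in each condition.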

3 More recent evidence points to important difficul-
ties in the interpretation of scores on this scale. Two
studies (2, 9) agree in reporting a negative correla- 
tion between MHS scores and the increase in intensity 
of aggressive behavior after arousal. 


TABLE 1


MEAN SCORES IN EACH OF THE
EXPERIMENTAL CONDITIONS a


Hostility Arousal          Non-Arousal

MHS Level   Measure        High A-S   Low A-S

High        MHS              31.2       23.3
            Annoyance b       7.5        6.0
            Like P c         12.8       17.7

Low         MHS              13.7        9.3
            Annoyance         8.0        7.7
            Like P           15.5       15.0

All         MHS              22.4       16.3
            Annoyance         7.8        6.8
            Like P           14.2       16.3


a N = 6 in each condition.

b The higher the score the less S enjoyed the experi-
ment.

c The higher the score the less S liked her partner.


PROCEDURE 


The experimental manipulations began as soon 
as any given S showed up for her appointment, 
ostensibly for an experiment on “problem-solv- 
ing.” In the Hostility Arousal condition, E met 
S with the statement that she was late and sar- 
castically asked whether this was always true. 
He then took her into another room (after first 
closing a door in her face) where he introduced 
S to a concept formation task. E deprecated S’s 
performance during this 10-minute task and 
questioned her ability to do well in college. In 
the Non-Arousal condition, E maintained either 
a neutral or moderately friendly attitude toward 
S. Following the completion of the concept for- 
mation task, E brought another girl into the room 
saying that this girl also had just finished the first 
phase of the experiment. In actuality, the new 
girl was a paid confederate of E. They were now 
to work together on another kind of problem: to 
decide how a mediation board would have voted 
in a supposedly real labor-management dispute. 

difference stems largely from the different meas-
ures of displaced aggression in the two experi-
ments. Lindzey used two indices of displace- 
ment, one based upon the number of incidents 
in the Ss’ TAT protocols in which “self” figures 
carried out aggressive behavior against “nonself” 
figures, the other the extrapunitive scale of the 
P-F Study. Conceivably, these measures could 
have low reliability and validity as indices of dis- 
placement. It also is reasonable to assume that 
the TAT index assesses, not displaced but, rather, 
a general fantasy aggression. Support for this 
possibility is found in a recent experiment by 
Hokanson and Gordon (9). They noted that Ss 
with low scores on Siegel’s MHS tended to in- 
crease in fantasy aggression after arousal, while 
the High MHS Ss had lower scores after arousal. 
Since anti-Semitism is positively correlated with 
MHS, this finding is consistent with Lindzey’s 
observation that the less anti-Semitic Ss had a 
somewhat higher fantasy aggression score (our 
term for his measure) after arousal than the more 
highly prejudiced Ss. 

There is little doubt either that the displace- 
ment of aggression occurs or that this phenome- 
non can play an important part in many (but 
certainly not all) cases of social prejudice. How- 
ever, it is equally apparent that displacement is 
not necessarily elicited in angered persons simply 
by preventing attacks against the instigator of 
their hostility. A target’s likelihood of receiving 
displaced hostility varies depending upon a num- 
ber of factors, such as (a) whether the direct 
attacks against the instigator of aggression are 
prevented by internal conflict or by his absence, 
(b) the degree of similarity between the given 
available target and the perceived instigator of 
aggression, (c) the relative strength of the tend- 
encies to aggress and to inhibit the aggression, 
(d) the nature of the threat arousing the anger, 
and (e) the personality of the angered individual. 
Taking these in order, Miller’s (13) theoretical 
analyses and the research stimulated by him (14, 
17) demonstrate the importance of the first three 
factors. Feshbach and Singer (7) show that a 
personal or ego threat often results in increased 
hostility and in social prejudice (i.e., displaced 
hostility), while a “shared” threat results in a 
decrement in the expression of social prejudice, 
and the present study indicates that character-
istically prejudiced (anti-Semitic) Ss are more
likely to displace aggression when angered than
less prejudiced Ss. 


SUMMARY 


The present study was designed to demonstrate 
that an object’s likelihood of receiving displaced 
aggression varies not only with the objective 
stimulus factors discussed by Miller (13) and
others (4) but also with the personality character- 
istics of the angered person. In particular, it 
was hypothesized, in accord with some versions 
of the scapegoat theory of prejudice, that highly 
anti-Semitic Ss would be more likely to displace 
aggression when angered than less anti-Semitic 
Ss. The results support the hypothesis. Several 
factors possibly accounting for this displacement 
are discussed. Evidence is provided consistent 
with the view that prejudiced people are likely 
to exhibit a diffuse projectivity when frustrated, 
i.e., to generally blame others for the annoyance 
they feel and deny self-responsibility for their 
plight. 


REFERENCES 


1. Adorno, T., Frenkel-Brunswik, E., Levin-
son, D., and Sanford, R. The authoritarian
personality. New York: Harper, 1950. 

2. Berkowitz, L. Manifest hostility level and 
hostile behavior. J. soc. Psychol., in press. 

3. Brown, J. F. The origin of the anti-Semitic 
attitude. In I. Graeber and S. H. Britt (Eds.), 
Jews in a gentile world. New York: Mac-
millan, 1942.

4. Dollard, J., Doob, L., Miller, N., Mowrer, O., 
and Sears, R. Frustration and aggression. 
New Haven: Yale Univer. Press, 1939. 

5. Elliott, D., and Wittenberg, B. Accuracy of 
identification of Jewish and non-Jewish photo- 
graphs. J. abnorm. soc. Psychol., 1955, 51, 
339-341. 

6. Fenichel, O. Elements of a psychoanalytic 
theory of anti-Semitism. In E. Simmel (Ed.), 
Anti-Semitism: A social disease. New York: 
International Universities Press, 1946. 

7. Feshbach, S., and Singer, R. The effects of 
personal and shared threats upon social preju- 
dice. J. abnorm. soc. Psychol., 1957, 54, 
411-416.

8. Fromm, E. Escape from freedom. New 
York: Rinehart, 1941. 

9. Hokanson, J., and Gordon, J. The expres-
sion and inhibition of hostility in imaginative
and overt behavior. J. abnorm. soc. Psychol.,
1958, 57, 327-333.

10. Lesser, G. S. Extrapunitiveness and ethnic
attitude. J. abnorm. soc. Psychol., 1958, 56,
281-282.

11. Lindzey, G. An experimental examination of
the scapegoat theory of prejudice. J. abnorm.
soc. Psychol., 1950, 45, 296-309.

12. Maslow, A. Authoritarian character struc-
ture. J. soc. Psychol., 1943, 18, 401-411.

13. Miller, N. Theory and experiment relating
psychoanalytic displacement to stimulus-re-
sponse generalization. J. abnorm. soc. Psy-
chol., 1948, 43, 155-178.

14. Murney, R. An application of the principle
of stimulus generalization to the prediction of
object displacement. Washington, D. C.:
Catholic Univer. America Press, 1955.

15. Siegel, S. The relationship of hostility to au-
thoritarianism. J. abnorm. soc. Psychol.,
1956, 52, 368-372.

16. Stagner, R., and Congdon, C. Another fail-
ure to demonstrate displacement of aggres-
sion. J. abnorm. soc. Psychol., 1955, 51,
695-696.

17. Wright, G. Projection and displacement: A
cross-cultural study of folktale aggression. J.
abnorm. soc. Psychol., 1954, 49, 523-528.


GENERAL REFERENCES 


Buss, A. The psychology of aggression. New
York: John Wiley, 1961.

Cattell, R. B., and Scheier, I. H. The meaning
and measurement of neuroticism and anxiety.
New York: Ronald Press, 1961.

Dahlstrom, W. G., and Welsh, G. S. An MMPI
handbook. Minneapolis: University of Minne-
sota Press, 1960.

Edwards, A. L. The social desirability variable in
personality assessment and research. New
York: Dryden, 1957.

Sarason, S. B., Davidson, K., Lighthall, F.,
Waite, R., and Ruebush, B. K. Anxiety in ele-
mentary school children. New York: John
Wiley, 1960.

Welsh, G. S., and Dahlstrom, W. G. (eds.) Basic
readings on the MMPI. Minneapolis: Univer-
sity of Minnesota Press, 1956.


SECTION II


Personality 
Measures 
Based on 
Subject’s 

Fantasy 




The use of Rorschach inkblots and other tech- 
niques for eliciting fantasy productions usually 
is based on the assumption that the individual 
projects characteristics and tendencies personally 
relevant and often covert into his description of 
a relatively ambiguous stimulus. Although these 
tests are in wide use by clinical psychologists 
because of the clinician’s need to make statements 
about underlying covert influences on overt be- 
havior, there is still widespread agreement con- 
cerning the need for a better understanding of 
the methods available for tapping fantasy ma- 
terial and a need for improvement of existing 
techniques. 

The first paper of this section presents evidence 
which sheds light on the question: what is the 
relationship between personality questionnaire 
and fantasy measures obtained from the same 
subjects? Using a sample of children, Sarason,
Davidson, Lighthall and Waite selected subjects 
differing in anxiety scores, and, then, compared 
these groups with respect to intellectual perform- 
ance and Rorschach responses. 

Whereas Sarason et al. were interested in exam-
ining Rorschach protocols as a function of subjects’
anxiety scores, Gross sought to determine the 
effect of certain aspects of the testing situation 
on Rorschach performance. Gross’ finding that 
the testers’ differential reinforcement of the sub- 
jects’ behavior influences Rorschach responses is, 
in one sense, similar to the findings discussed in 
Section One concerning the influence of response 
sets on personality questionnaire performance.
In both cases, “irrelevant” factors were found to 
play important roles in affecting test behavior. 
Rotter, in an article contained in Section Three, 
examines further the implications of these sorts 
of variables for personality assessment. 

The Thematic Apperception Test (TAT) dif- 
fers from the Rorschach primarily in that the 
TAT cards are more structured than the Ror- 
schach cards. That is, the Rorschach cards are 
simply inkblots while most of the TAT cards 
depict more recognizable scenes and interper- 
sonal situations. If fantasy material of the TAT 
type can be related to overt behavior, then it would 
seem worthwhile to investigate many possible the-
matic variables in order to find out which ones
yield the most consistent relationships to behavior. 
Of all of the dimensions or variables inferred from 
thematic material probably none has received 
more careful experimental analysis than that of 
need for achievement. The experiment by Mc- 
Clelland, Clark, Roby and Atkinson was one of 
the first investigations of this variable. In their 
article they summarize the procedures used (1) 
in scoring need for achievement, and, impor- 
tantly, (2) in attempts at experimentally manip- 
ulating need for achievement. 

Most users of TAT-type instruments assume 
that the subject, when confronted with a picture 
of persons involved in a given scene or activity, 
will invest in one of the persons attributes most 
characteristic of himself. This person may be 
considered the hero of the story. Lindzey and 
Kalnins, in their research, have made an attempt
to empirically determine the tenability of this 
“hero assumption.” In general, their findings 
support the hypothesis that subjects see the heroes 
of their stories as being more similar to them- 
selves than the non-heroes. 

Just as normative data for intellectual and 
achievement tests must be re-examined and re- 
vised with the passage of time, so this process 
also is true for personality tests. Jenkins and 
Russell, in their article in Section Two, present 
a carefully executed investigation of changes in 
the frequency of occurrence of response words 
to stimulus words on the Word Association Test. 
In their research, Jenkins and Russell compare 
Word Association Test responses made by a 1927 
and by a 1952 sample of subjects. Their find- 
ings of many significant differences in response 
patterns between these two samples will prove 
valuable to future researchers interested in the
factors affecting Word Association Test perform-
ance.


RORSCHACH BEHAVIOR AND 
PERFORMANCE OF HIGH AND 
LOW ANXIOUS CHILDREN * 


SEYMOUR B. SARASON, KENNETH DAVIDSON, 
FREDERICK LIGHTHALL, AND RICHARD WAITE * 


The present article deals with the performance 
and behavior of high anxious (HA) and low 
anxious children (LA) in the Rorschach situa- 
tion. This study is part of an ongoing research 
project on the measurement and correlates of 
anxiety in school age children. Previous reports 
(3, 6, 7, 8, 9) from this project have described 
the construction of Test Anxiety (TA) and Gen- 
eral Anxiety (GA) Scales and, in addition, pre- 
sented data indicating that these scales had an 
encouraging degree of validity. 

In an attempt to explore further the validity 
of these scales we chose the Rorschach situation 
because it is one in which the child has to solve 
problems involving culturally unfamiliar stimuli 
with practically no aid or direction from another 
person. The method of administration with its 
underlying rationale has previously been de- 
scribed by Sarason (5, p. 1/1). It should be 
emphasized that the method of administration 
involves no questioning or prodding in the per- 
formance and very little questioning in the in- 
quiry. Briefly, our expectations were that in a 
problem-solving situation in which the child is 
required to make his own decisions, there would 
be more signs of interference and ineffectiveness 
in the functioning of the HA child than in the 
case of the LA child. 


SUBJECTS 


Thirty-two pairs of subjects were matched for 
grade, sex, and average verbal and nonverbal 
T-score on the Otis Quick-Scoring Mental Ability 
Test (Alpha). They differed in that one member 
of each pair was in the fourth quartile of scores 
on Test Anxiety and General Anxiety Scales (the 
HA subject) while the other member of the pair
was in the first quartile (the LA subject). There
were 16 boys and 16 girls in each of the two
anxiety groups. The 32 pairs were drawn from
a sample of 747 subjects in grades 1 through 4,
who had been given the anxiety questionnaires
several months earlier. At the time of the Ror-
schach the subjects were distributed in grades 2
through 5. The total sample of 747 excluded
subjects with a known academic or behavior
problem and those whose parents were divorced
or separated.


* Reprinted by permission from Child Develop-
ment, June, 1958, Vol. 29, No. 2, 277-285.


PROCEDURE 


The Rorschach testing was done by two female 
examiners. Following the inquiry of each re- 
sponse, the subject was given a piece of tracing 
paper with the request to trace the response so 
that the examiner “could see it just as you did.” 
After the test was over, a description of the sub- 
ject’s behavior was written, averaging approxi- 
mately one and one-half double-spaced pages in 
length. These behavior descriptions were not 
made with any fixed schedule of traits in mind. 
The examiners were told to describe what to 
them were the dominant characteristics of the 
child’s behavior, and also to include their own 
spontaneous reactions to the child’s behavior. 

One of the writers (S.B.S.) spent approxi- 
mately 20 to 40 minutes in a somewhat super- 
ficial clinical analysis of each protocol (including 
the behavior description). Brief notes were made 
during the analysis, and in each case a judgment 
was made as to whether the subject was HA or 
LA, i.e., whether the child answered the ques- 
tionnaires as an HA or LA subject. Following 
this, the 64 records were arranged for the clinician 
into the 32 matched pairs with the purpose now 
of re-evaluating the previously recorded judg- 
ments and making whatever changes seemed indi- 
cated when examining each member of the pair 
together, the group to which each member be- 
longed still being unknown to the clinician. The 
comparison of the matched pairs was done before 
the results of the first analysis were known. 

The clinical analyses were based both on the- 
oretical conceptions concerning anxiety as well 
as the particular psychologist’s clinical experience 
with children. Therefore, the major criteria em- 
ployed subjectively by the clinician were then 
explicitly stated and tested statistically. 

The protocols were also scored in the conven- 
tional manner, using Hertz’s (4) manual as a 


guide. The two Rorschach examiners independ- 
ently scored 10 protocols and agreement on 73 
per cent of the responses was obtained. Each of 
the remaining 54 protocols was then scored by 
both examiners in conference. Where there was 
a difference in scoring a response, an additional 
judge was consulted. It might be helpful to 
record our opinion that, in the absence of trac- 
ings, the scoring of these records would have been 
markedly less reliable than it was. 

In the present study, as in all other aspects of 
this project, the collection and analysis of the 
data were done without knowledge of the group 
to which a child belonged. 


RESULTS 


Clinical Analysis of the 64 Records.—Forty
of the 64 cases were placed in the proper anxiety 
group, the p value for this result being .03. 
When the 64 records were then arranged into 
the 32 matched pairs with the opportunity to 
make new judgments, 18 of the 32 pairs were 
correctly categorized, this result being signifi- 
cant at only the .30 level. The clinician making 
the judgments experienced great difficulty with 
12 of the matched pairs, these pairs containing 
cases who in the previous analysis had also been 
a source of difficulty. In making his final judg- 
ment with these 12 pairs the clinician explicitly 
indicated that he was making his judgments with 
much doubt. The doubt did not concern the 
judgment about the subject's anxiety in the Ror- 
schach situation but whether or not the subject 
would admit to anxiety on the questionnaires. 
It should also be noted that all judgments on the 
nonquestioned pairs took approximately 30 to 
45 minutes. In the 20 pairs which were not 
questioned, 15 were correct (p = .03). (The
above p values were obtained using the sign test 
procedure.) 
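
The sign test behind these p values asks how often a result at least this one-sided would arise if each judgment were no better than a coin toss. A minimal sketch in Python, assuming an exact one-tailed binomial tail with chance set at .5. The function name and the labels are illustrative and do not come from the article. The authors may have used a normal approximation or tabled values, so the computed figures need not agree with the rounded values in the text to the last digit.

```python
from math import comb

def sign_test_one_tailed(successes: int, n: int) -> float:
    """Probability of at least `successes` correct out of n
    when each judgment has a .5 chance of being correct."""
    favorable = sum(comb(n, k) for k in range(successes, n + 1))
    return favorable / 2 ** n

# (number correct, number judged) as reported in the text
results = {
    "64 single records": (40, 64),
    "32 matched pairs": (18, 32),
    "20 unquestioned pairs": (15, 20),
}

p_values = {
    label: sign_test_one_tailed(correct, total)
    for label, (correct, total) in results.items()
}

for label, p in p_values.items():
    print(f"{label}: one-tailed p = {p:.3f}")
```

An exact tail was chosen here because the samples are small enough to enumerate directly, which avoids any assumption about how the original approximation was made.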

When these results were known, the clinician 
then reviewed the incorrectly matched pairs in 
order to determine, if possible, the reasons for 
the mismatching. It quickly became clear that 
there were a number of individuals in these pairs 
for whom the examiner explicitly had used such 
phrases as: “He is a very guarded child,” “I felt 
I had no rapport at all with this child,” “He was 
very defensive and I don’t know what was going 
on inside.” It also seemed as if it was largely 
LA cases who were so described and that such 
behavior was interpreted by the clinician as a 
reflection of anxiety in the child. The two exam-
iners independently then read each of the 64
behavior descriptions with the instruction to put 
each child into one of two categories: The child 
was clearly of the “guarded” type or he was not. 
There was complete agreement in 13 cases that 
the child was guarded. There were six cases 
where one examiner categorized the child as 
guarded while the other examiner had not, al- 
though the latter judge explicitly questioned her 
own categorization in these six instances. There 
was no instance where one examiner judged a 
child as guarded while the other examiner clearly 
did not. 

The 13 cases in which there was complete 
agreement were largely LA (nine LA vs. four 
HA) and included seven children who were mem- 
bers of mismatched cases. Of the six cases where 
there was partial agreement about the child’s 
guardedness, four were HA and two LA, with 
only one a member of a mismatched pair. In 
other words, of the 12 pairs which the clinician 
had mismatched and about which he had felt 
great uncertainty in judging, eight contained a 
child who was guarded or defensive or with 
whom the examiner could not establish rapport. 
Six of the eight were LA cases labeled HA by 
the clinician. It would seem, therefore, that one 
of the sources of error in matching was the child 
who admitted to little or no anxiety on the two 
scales but whose guarded behavior in the Ror- 
schach situation was interpreted by the clinician 
as symptomatic of anxiety. From the inception 
of this project we had assumed that, if a child 
received high scores on both scales, we could 
place greater credence on his report than on that 
of the child who admitted to little or no anxiety. 
The analysis of the mismatched pairs supports this as-
sumption.

Criteria Employed by Clinician.—What follows 
here concerns those criteria which the interpret- 
ing clinician employed and could state in a way 
so as to make statistical testing possible. The 
interpretive process is too complex and too little 
understood and studied to expect that, at this 
point at least, one can do other than evaluate 
aspects of the process. It is easy for a clinician 
to say that he employed a criterion, but it is 
extremely difficult for him to indicate how he 
weights it when other criteria are present in vary- 
ing degrees. 


Rejection of cards. The inability to respond 
is perhaps the most blatant indicator of inter- 
ference in response. In the HA group 13 of 
the 32 cases rejected at least one card while in 
the LA group there were six such cases. A chi 
square analysis resulted in a p value of .05. (This 
and all subsequent p values are based on a one- 
tail test.) 
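
The card-rejection comparison can be laid out as a 2 × 2 table: 13 rejecting and 19 not rejecting among the HA children, 6 and 26 among the LA children. A minimal sketch in Python, assuming a chi square with Yates' continuity correction and a one-tail p obtained by halving the upper-tail probability for one degree of freedom. The article does not say whether a correction was applied, so that choice, like the function names, is an assumption made for illustration.

```python
from math import erfc, sqrt

def chi_square_2x2(a: int, b: int, c: int, d: int, yates: bool = True) -> float:
    """Chi square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction: shrink the discrepancy by n / 2.
        diff = max(diff - n / 2, 0.0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

def one_tailed_p(chi2: float) -> float:
    """Half the upper-tail probability of chi square with 1 df.
    For 1 df the upper tail equals erfc(sqrt(chi2 / 2))."""
    return 0.5 * erfc(sqrt(chi2 / 2))

# Rows: HA, LA. Columns: rejected at least one card, rejected none.
chi2 = chi_square_2x2(13, 19, 6, 26)
p = one_tailed_p(chi2)
print(f"chi square = {chi2:.2f}, one-tail p = {p:.3f}")
```

Under these assumptions the corrected chi square is about 2.7 and the one-tail p is close to .05, in line with the value reported above.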

Minus response (F-%). This measure reflects
the degree to which an individual’s responses do 
not correspond to the stimulus area employed. 
In a chi square analysis in which the cut-off point 
was the median of the F-% distribution, signifi- 
cantly more HA were above the median 
(p = .025). When the vague responses were 
taken into account the differences between the 
groups disappeared, contrary to the clinician’s 
expectations. In other words, to the extent that 
the clinician attributed the same weight to giving 
many vague responses as to a high F-%, he was 
reducing his valid discriminations between the 
two anxiety groups. 

Total number of responses. The expectation 
that anxiety would interfere with output is, of 
course, similar to the expectation concerning card 
rejections. In a chi square analysis there was a 
tendency for HA subjects to give fewer responses 
(p = .05). When cases in the two anxiety groups 
who rejected cards were excluded from the anal- 
ysis, the probability level remains the same. 

Anatomy responses. It is assumed that the 
high anxious individual is one who, consciously 
or unconsciously, is or has been concerned with 
body adequacy or integrity, and that such con- 
cern will be reflected in the content of responses. 
The following are examples of anatomy re- 
sponses: skeleton, inside of somebody, X-ray, 
lungs, person’s breast, tonsils, etc. Thirteen HA 
individuals gave at least one anatomy response 
while six in the LA group did so (p = .05). For 
reasons similar to those above, it was expected 
that more HA than LA individuals would give 
responses in which somebody or something was 
explicitly killed or damaged. There was no dif- 
ference at all between the groups in this type of 
content. 

Active aggressive responses. It is assumed that 
the anxious individual tends to perceive himself 
as inadequate and has difficulty justifying the 
direct expression of aggressive feeling (3). On 
this basis it was expected that more LA than HA 
individuals would give responses involving fight-
ing, arguing, or a “volcano” percept. A chi
square analysis indicated that more LA than HA 
individuals gave such responses (p = .05). 

The above results suggest that most of the 
criteria which the clinician could explicitly state 
had discriminatory value. 

Results with Conventional Scores.—More HA
than LA individuals were unable to respond at 
all to any objective property of the cards in- 
volving color (chromatic, achromatic, shading), 
the p value being .025. Seventeen of the 32 HA 
cases could not respond to any aspect of color 
while eight in the LA group could not do so. 
This finding, which was the only one of the con- 
ventional scores which attained or was near sig- 
nificance, can be integrated with some of the 
findings in the previous section in the following 
way: in a problem-solving situation containing 
relatively unfamiliar stimuli to which he must 
respond independently (i.e., deciding for one- 
self how, when, and how often to respond) the 
HA individual will, when he can respond at all, 
tend to reflect in his responsiveness illogical or 
irrational ways of thinking, and such responsive- 
ness will tend not to incorporate the obvious 
properties of the stimulus. This conclusion, 
which is akin to one by Cox and Sarason (2) in 
a similar study of college students, suggests that 
one of the effects of anxiety is to exacerbate to 
an interfering degree the role of internal and 
affective factors in outwardly directed respon- 
siveness. 

Analysis of Behavior Notes.—The first anal-
ysis involved a count of those words (e.g., afraid,
uneasy, shy, disturbed, etc.) which might be 
symptomatic of anxiety. Some of these words 
appeared with too little frequency to warrant any 
conclusion. For any single word category there
were no notable differences between HA and LA 
groups. When a count was made of the number 
of different categories in which an individual was 
entered, there was a tendency for HA boys to be 
entered in more categories than was the case for 
LA boys.2 There was no such difference between
HA and LA girls. The only other tendencies 
which seemed worthy of note concerned HA boys 
and HA girls: eight or 50 per cent of the HA 
girls were labeled as “nervous” by the examiner 
whereas only two of the HA boys were so desig- 


2In the analyses previously presented there were 
no sex differences which were significant or ap- 
proached significance. 


nated; 11 or-65 per cent of the HA boys were 
described as uncertain or unsure of themselves 
in this situation, whereas only three HA girls 
were so described. No such differences were 
found between LA boys and girls. It may be 
that a word like “nervous,” on the one hand, and 
words like “unsure” and “uncertain,” on the other 
hand, did not in fact signify different kinds of 
behavior. However, it is our impression from 
the context in which these words appeared that 
“nervous” signified a general impression of the 
examiner (e.g, “a nervous type of child,” “a 
nervous, high strung child”) whereas “unsure” 
and “uncertain” seemed to refer more specifically 
to reactions to the problem-solving task. 

In order to check further on the behavioral 
differences suggested above, a count was made 
of the number of questions which the child asked 
the examiner during the performance part of the 
Rorschach, that part during which the examiner 
was most nondirective and the factor of un- 
familiarity of the stimulus task presumably most 
strong. We were interested in such a count be- 
cause we assumed that the asking of questions 
reflected not only dependence but uncertainty as 
well. It was found that three HA girls asked 
questions whereas nine HA boys did so. This
suggests that the label “nervous” in connection 
with HA girls may well reflect behavior different 
from that signified by “uncertainty and unsure- 
ness” in HA boys. In the other anxiety group six 
LA girls and three LA boys asked questions. 
Again we found that the differences between HA 
and LA boys were greater than between LA and 
HA girls. 

A further analysis, relevant to the above, con- 
cerned the child’s handling of the tracing of each 
of his responses. The focus here was whether 
a child said he could not or did not know how 
to trace a response, or stated that he was not 
satisfied with what he had traced, or where the 
examiner explicitly stated that the child was con- 
cerned in some way with having to trace his 
responses. As in the previous analyses it was 
the HA boys who experienced difficulty with 
tracings: nine HA boys vs. four HA girls, and 
four LA boys vs. five LA girls. In going over 
the count it became apparent that some children 
had been described by the examiner as being con- 
cerned either if in tracing they tended to check 
their tracing with the blot or if they took back 
the tracing from the examiner in order to add
something that had been forgotten. Since such
behavior seems much less clearly indicative of 
concern than a child’s own statement of concern, 
a similar count was made excluding these cases. 
In the new count there were eight HA boys vs. 
two HA girls, and two LA boys vs. four LA girls. 
The tendency for differences to appear between 
HA and LA boys, but not between HA and LA 
girls, remained. 


DISCUSSION 


We have already indicated that we chose the 
Rorschach situation because it is one in which 
the child had to decide for himself how to handle 
a problem-solving task containing relatively un- 
familiar stimuli. We might put our intention in 
another way. We were attempting to increase the 
possibility that a child would experience a situa- 
tion as one of danger in that external support for 
the handling and fulfillment of his needs would 
be minimized or nonexistent. Because it is in 
such a situation that anxiety should be experi- 
enced, it was predicted that an HA group would 
be more adversely affected than would the LA 
group. We evaluate the findings we have pre- 
sented as supporting such a prediction. It seems 
appropriate to indicate at this point that the find- 
ings of the present study are similar to those ob- 
tained with HA and LA college students, despite 
the obvious differences in the nature of the two 
samples (2). 

One of the most thorny problems in the de- 
velopment of truly discriminating personality 
questionnaires, especially when the subject is 
asked to reveal what may be termed weaknesses, 
is that the subject can wittingly or unwittingly 
give an invalid response. The findings in the 
present study in relation to the “guarded” child
underline the importance of the problem but
throw relatively little light on two questions.
How can one pick out from questionnaires the 
child who admits to little or no anxiety on the 
questionnaire but whose overt behavior in relevant 
situations suggests the experience of anxiety? 
How can one discriminate between the child who 
consciously gives an invalid low score and the 
one whose perception and remembrance of his 
test behavior is subject to defensive distortion? 
In an attempt to explore this problem a “lie”
scale consisting of 11 items was developed and 
embedded in the general anxiety scale. Each 
of these 11 items was specifically concerned with 


the lie tendency in relation to anxiety or worry 
(e.g., “I have never had a scary dream,” “Do you 
ever worry?”). The correlation between lie and 
anxiety scores tended to be of such size (ranging 
from —.65 to —.75) that one could not assume 
that the two scores were measuring different 
things. This finding is markedly at variance
with that of Castaneda, McCandless, and Palermo
(1) who found practically no correlation between 
anxiety and lie scores. It is important to point 
out that whereas our lie items specifically con- 
cerned anxiety their items were more omnibus 
in nature. It is our opinion that it is erroneous 
to assume that the lie or defensive tendency is 
not dependent on the content area (e.g., anxiety, 
aggression, cheating) about which one is ques- 
tioning the subject. If the lie items concern 
cheating or kindness, and the scale in which they 
are embedded concerns anxiety, there is no com- 
pelling a priori reason to expect a significant cor- 
relation between the lie and anxiety scales. It 
would seem that until more thought is given to 
the theoretical rationale and methodology of the 
measurement by questionnaire of different kinds 
of defensive tendencies the clinical and research 
utility of personality questionnaires will be 
limited. 
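
The size of the lie-anxiety correlations reported above can be put in terms of shared variance. The following short sketch is an illustration added here, not a computation from the study; it simply squares the two ends of the reported range, and the variable names are hypothetical.

```python
# Range of correlations reported in the text between lie and anxiety scores.
reported_r = (-0.65, -0.75)

# Squaring a correlation gives the proportion of variance the two scores share.
shared_variance = [r ** 2 for r in reported_r]

for r, shared in zip(reported_r, shared_variance):
    print(f"r = {r:+.2f} -> shared variance = {shared:.2f}")
```

On this reading roughly two fifths to somewhat over one half of the variance is common to the two scores, which is the sense in which they cannot be assumed to measure different things.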

An important problem suggested by this study 
stems from the suggestive trend in the behavior 
notes, namely that the overt behavior of HA boys 
seems to reflect anxiety differently from that of 
HA girls. In addition, the overt behavior of HA 
boys seemed different from that of LA boys 
whereas no such trend was discernible between 
HA and LA girls. Although these trends may be 
unreliable because of the small number of cases 
upon which they are based, we presented the be- 
havioral data because of their possible significance 
for other of our findings. Twenty-four of the 
32 matched pairs of subjects were also utilized 
in a learning study (8). In that study significant 
differences between HA and LA subjects were 
found but the difference between the HA and 
LA boys was notably larger than that between 
HA and LA girls. In addition, although girls 
rather consistently get significantly higher anxiety 
scores than boys, the correlation between anxiety 
score and score on conventional group tests of 
intelligence is no higher among girls than among 
boys. One possible implication of these different 
findings is that a high anxiety score does not have 
the same psychological significance in a boy as 
in a girl. We have elsewhere (7) advanced the
hypothesis that our culture makes it much more 
difficult for a boy than for a girl to admit ex- 
plicitly to anxiety (or weakness). For a girl 
to admit to anxiety is not as likely to impair her 
own or others’ evaluation of her femininity as a 
similar admission in a boy would impair his own 
or others’ evaluation of his masculinity. If this 
is correct, one might expect that the anxiety 
which the HA girl admits to on the questionnaires 
does not have as interfering an effect on her 
problem-solving behavior as in the case of the 
HA boys. The fact that boys get lower anxiety 
scores than girls—which we interpret as a reflec- 
tion of a kind of learned defensiveness or suppres- 
sion in boys—would suggest that the boy who 
does get a high score is one who has difficulty 
handling his anxiety. The finding in the present 
study concerning the dependent behavior of HA 
boys fits in with such a hypothesis. 


SUMMARY 


The Rorschach was administered to 32 high
and 32 low anxiety children who were matched
for grade, sex, and IQ score. A relatively brief 
and blind clinical analysis (case-by-case) by a 
clinician resulted in significant discriminations 
between the two anxiety groups. In a matched- 
pair analysis the discrimination was not signifi- 
cant. Further analysis revealed that one of the 
sources of error in matching was the child who 
admitted to little or no anxiety on the anxiety 
scales but whose guarded behavior in the Ror- 
schach situation was interpreted by the clinician 
as symptomatic of anxiety. 

In contrast to the low anxious subjects, the 
high anxious rejected more cards, gave fewer 
responses, gave fewer responses with aggressive 
content, gave more responses with anatomy con- 
tent, and responded less to any aspect of color 
(chromatic, achromatic, shading). An analysis 
of the behavior notes suggested that the HA boys 
are more easily differentiated from the LA boys 
than are the HA girls from the LA girls. The 
HA boys present a rather clear behavioral picture 
of indecision and dependence. 

The results of this study are discussed in terms 
of (a) their implications for the measurement of 
defensive tendencies as they affect questionnaire 
taking and (b) the possibility that identical scores 
(e.g., a high score) in boys and girls have differ- 
ent psychological significances. 


EFFECTS OF VERBAL AND
NONVERBAL REINFORCEMENT 
IN THE RORSCHACH * 


LEONARD R. GROSS¹


While there is some awareness of the dynamic
interactions between examiner (E) and subject
(S), the problem of just how and to what extent
test results are influenced by these interactions
has not been fully explored. It has been demon-
strated that Rorschach responses can be in-
fluenced by orientational sets stemming from
pretest practice (3), pretest suggestion (7), and
conscious instructions (2), as well as by E dif-
ferences (4), but there are also reasons to be-
lieve that the E’s actions throughout the testing
situation may affect Rorschach performance (6).

* Reprinted by permission from the Journal of
Consulting Psychology, February, 1959, Vol. 23, No.
1, 66-68.

1 The author wishes to acknowledge his indebted-
ness to W. J. Eichman and B. M. Smith of the
Roanoke, Virginia, Veterans Administration Hospital
for their guidance and assistance.

It would be desirable to show that different 
cues given by the E are responded to without 
conscious awareness by the S, resulting in a 
changed Rorschach protocol. This study at- 
tempted to reinforce general human content on 
the Rorschach, using the verbal reinforcer good 
and the nonverbal reinforcer nodding. The Ss 
were randomly selected psychiatric patients. 

On the basis of previously mentioned studies 
it can be hypothesized that: (a) the verbal re- 
inforcer good will increase the frequency of the 
reinforced responses over that of a control group; 
(b) the nonverbal reinforcer nodding will in- 
crease frequency of the reinforced responses over 
that of a control group; and (c) the verbal 
stimulus will be more effective than the non- 
verbal stimulus in increasing the reinforced re- 
sponses. 


METHOD 


The Ss were selected from the psychiatric sec- 
tion of a university hospital on the following 
bases: (a) no history of organic brain damage, 
(b) a minimum of tenth grade education, and 
(c) no previous Rorschach experience. They 
were randomly selected from both the inpatient 
and outpatient services and placed in one of three 
groups in the following prescribed order: verbal 
reinforcement group (VR), nonverbal reinforce- 
ment group (NVR), or control group (C). 

The Ss were excluded from the study if they 
did not meet both of the following criteria: (a) 
give at least one response involving general hu- 
man content (humans, human-like creatures, hu- 
man anatomy) during the first two cards, and 
(b) give three responses per card for all ten 
cards. 

Out of the 46 Ss tested, 6 did not meet the 
criterion of three responses per card, and 10 
failed to produce one human response within the 
first two cards, leaving a total of 30 Ss, with 10 
Ss in each group. The mean age of all the Ss 
was 34, with a range of 17 to 53. The sexes
were evenly divided. The diagnoses were mixed
and included neurotics, character disorders, and 
psychotics. The mean level of education for the 
30 Ss was the twelfth grade, with 11 Ss having 
some college education. There were no signifi- 
cant differences in educational level among the 
three groups. 

The Ss were presented with the complete set 
of Rorschach cards in the standard procedure 
for the free association with one variation in the 
Beck instructions, i.e., the inclusion of a sentence 
requesting three responses per card. 

In the VR group, the E said “good” after each 
general human response. In the NVR group, the 
E nodded his head once after each human re- 
sponse. In the C group, the cards were admin- 
istered with the attempt not to offer any cues. 

Posttest interviews revealed that none of the 
Ss verbalized any awareness of the nature of the 
study. 


RESULTS 


The mean numbers of general human responses
were compared for the three groups. As a result
of heterogeneity of variance, a square root trans- 
formation was applied to the raw scores of the 
individual Ss. The means and variances of the 
transformed scores as well as the raw scores are 
found in Table 1. The analysis of variance of
the transformed scores for the effect of reinforce-
ment yielded an F significant at the .06 level.
The groups were compared with each other by
means of individual t tests. A one-tailed t test was
used, as the direction was previously specified.
The VR group gave more human responses than
the C group at the .05 level of significance. The
NVR group yielded more human responses than
the C group at the .02 level of significance. The
VR and NVR groups did not differ significantly
from each other.


TABLE 1

RAW AND TRANSFORMED SCORES OF
THE THREE GROUPS

                          Square Root
          Raw Scores      Transformation
Group     Mean    Var.    Mean    Var.
VR         8.7   11.34    2.86    [illegible]
NVR       10.3   27.12    3.09    .59
C          6.3    6.77    2.39    .33
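
The reason for applying a square root transformation can be seen from the raw means and variances in Table 1. What follows is a minimal sketch, not the author's computation: it assumes the usual delta-method approximation, Var(sqrt X) ≈ Var(X) / (4 · mean), and compares the largest-to-smallest variance ratio before and after. The function name and the dictionary layout are hypothetical.

```python
# Raw-score means and variances as printed in Table 1 (10 subjects per group).
raw = {
    "VR":  {"mean": 8.7,  "var": 11.34},
    "NVR": {"mean": 10.3, "var": 27.12},
    "C":   {"mean": 6.3,  "var": 6.77},
}

def sqrt_scale_variance(mean, var):
    """Delta-method approximation to the variance of sqrt(X)."""
    return var / (4.0 * mean)

raw_vars = [g["var"] for g in raw.values()]
approx_vars = [sqrt_scale_variance(g["mean"], g["var"]) for g in raw.values()]

# Ratio of largest to smallest group variance, a rough index of heterogeneity.
raw_ratio = max(raw_vars) / min(raw_vars)
approx_ratio = max(approx_vars) / min(approx_vars)

print(f"raw scores:        largest/smallest variance = {raw_ratio:.2f}")
print(f"square root scale: largest/smallest variance = {approx_ratio:.2f}")
```

The approximate variances will not match the tabled values exactly, since those were computed from the individual transformed scores, but the ratio shrinks in the same way.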


DISCUSSION 


The results suggest that nodding or saying 
“good” will increase the frequency of pre-selected 
content responses in a Rorschach situation. 

The first two hypotheses that the verbal rein- 
forcer good and the nonverbal reinforcer nodding 
will increase the frequency of the reinforced re- 
sponses appear to be substantiated. The third 
hypothesis that the verbal stimulus will be more 
effective than the nonverbal stimulus in increas- 
ing the reinforced response is rejected. The latter 
result is in confliet with previous studies using 
nonverbal cues (5). A possible explanation is 
that the use of a flashing light as a nonverbal 
reinforcer in other studies was not perceived as 
part of the testing situation, while nodding was. 
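
The comparison that supports the second hypothesis (NVR against C) can be retraced from the transformed means and variances in Table 1. The sketch below is offered under stated assumptions and is not the author's own calculation: it uses a pooled-variance t test with 10 subjects per group, takes the NVR variance of the transformed scores to be .59 (an assumed reading of the table entry), and compares the result with critical values taken from standard t tables.

```python
import math

n = 10  # subjects per group, as stated in the Method section

# Square-root-transformed summary statistics from Table 1.
nvr_mean, nvr_var = 3.09, 0.59  # .59 is an assumed reading of the table entry
c_mean, c_var = 2.39, 0.33

# Pooled variance for two groups of equal size.
df = 2 * n - 2
pooled_var = ((n - 1) * nvr_var + (n - 1) * c_var) / df
standard_error = math.sqrt(pooled_var * (2.0 / n))

t = (nvr_mean - c_mean) / standard_error

# One-tailed critical values for 18 degrees of freedom (standard t tables).
critical_05 = 1.734
critical_025 = 2.101

print(f"t({df}) = {t:.2f}")
print("beyond the one-tailed .05 value: ", t > critical_05)
print("beyond the one-tailed .025 value:", t > critical_025)
```

Under these assumptions the statistic lies beyond the one-tailed .025 value, which is in keeping with the .02 level reported for this comparison.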

The most obvious implication of the results is 
that one cannot discount even minimal or un- 
conscious cues of the E when analyzing a Ror- 
schach protocol. It follows that interpretations 
of test responses and test behavior should not be 
considered separately but in light of the total 
situation. Both the E’s behavior and the S’s con- 
ception of the testing situation are of import. 

While the results in this study are suggestive 
of some of the variables involved in the com- 
plicated interaction between tester and testee or 
interviewer and interviewee, it remains for future 
research to make experimentally clear other vari- 
ables operating in such situations. It is probable 
that other aspects of Rorschach responses can be 
reinforced in a similar manner, but exactly which 
responses occur in sufficient number to be rein- 
forced as well as how much of a variation in E 
behavior is necessary to produce a change in a 
response level has not been answered. Another 
question is just how important are the variables 
that can be affected. It seems that if classical
scoring methods are used these variables can
have considerable effect on the dynamic picture. 


SUMMARY 


The main interest was interpersonal relations 
in a clinical situation. The study was designed 
to test the general hypothesis that E-S interaction 
is an important variable in test results. Thirty 
psychiatric patients were randomly selected and 
administered the free association of the Ror- 
schach with the standard instructions modified so 
as to elicit three responses per card. The Ss 
were then presented with either verbal reinforce- 
ment good, nonverbal reinforcement nodding, or 
no reinforcement, whenever they gave a general 
human content response. It was found that both 
the VR and NVR groups gave significantly more 
of the reinforced responses than the C group. 
There were no significant differences between 
the two types of reinforcement. The findings 
suggest that cues given by the E can affect re- 
sponse categories. The necessity for avoiding 
interpretations of test protocols in vacuo was 
discussed. 


REFERENCES 


1. Abramson, L. S. The influence of set for area 
on Rorschach test results. J. consult. Psychol., 
1951, 15, 337-342. 

2. Henry, Edith M., and Rotter, J. B. Situa- 
tional influences on Rorschach responses. J. 
consult. Psychol., 1956, 20, 457-462. 

3. Leventhal, H. The influence of previous per- 
ceptual experience on the variance of the Ror- 
schach W and Z scores. J. consult. Psychol., 
1956, 20, 93-98. 

4. Sanders, R., and Cleveland, S. E. The rela- 
tionship between certain E personality variables 
and S’s Rorschach scores. J. proj. Tech., 1953, 
17, 34-50. 

5. Taffel, C. Anxiety and the conditioning of 
verbal behavior. J. abnorm. soc. Psychol., 
1955, 51, 496-501. 

6. Wickes, T. A., Jr. Examiner influence in a
testing situation. J. consult. Psychol., 1956,
20, 23-26.




THE PROJECTIVE EXPRESSION 
OF NEEDS. IV. THE EFFECT OF 
THE NEED FOR ACHIEVEMENT 
ON THEMATIC APPERCEPTION * 


DAVID C. MCCLELLAND, RUSSELL A. CLARK,
THORNTON B. ROBY, AND JOHN W. ATKINSON¹


Previous experiments in this series have been 
concerned with establishing principles for the 
interpretation of projective behavior. The 
method has been to note changes in perception 
(13) and apperception (2) resulting from differ- 
ent intensities of the hunger drive. A number of 
shifts in perception and in the thematic content 
of stories have been established which provide 
important clues for the detection of the strength 
of the hunger drive from projective records. 

But the crucial experiment in the series remains 
to be performed. No one is particularly inter- 
ested in diagnosing hunger from projective re- 
sponses. The point is, do the same kinds of shifts 
occur for an experimentally controlled psycho- 
genic need, or are the clues which have been 
discovered applicable only to some simple physio- 
logical tension like hunger? 

The present experiment was designed to answer 
this crucial question. It was decided to choose 
a psychogenic need which could be aroused ex- 
perimentally and to see whether it produced per- 
ceptive and apperceptive changes similar to those 
already noted for hunger. The need chosen was 
“need achievement” or “need mastery,” the need 
which presumably is aroused by experimentally 
inducing ego-involvement, according to a tech- 
nique which by now is fairly well standardized 
among psychologists experimenting in the field 
of personality (1, 16, 17). The word “presum-
ably” is used advisedly. No one knows for certain
that there is a unitary n Achievement² which
can be satisfied by success and aroused by failure
in the same way that hunger is satisfied by food
and aroused by deprivation of food. However,
if manipulation of the conditions of ego-involve-
ment produces the same kinds of effects on pro-
jection as manipulation of hours of food depriva-
tion, there will be some basis for considering the
psychogenic state aroused as a need, at least to
the extent that it functions like a physiological
one. It was to establish this kind of parallelism
of function that work began in this series with
a simple physiological tension which nearly every-
one would accept as a need or drive. Conse-
quently, if the results in this experiment are in
substantial agreement with those obtained in
earlier ones in the series, it will provide evidence
for the existence of higher order psychogenic
needs which at least function like those at a
simpler physiological level.

* Reprinted by permission from the Journal of Ex-
perimental Psychology, April, 1949, Vol. 39, No. 2,
242-255.

1 This project was carried out at Wesleyan Univer-
sity and was made possible by a grant from the
Office of Naval Research for which the authors are
very grateful.

2 The convention adopted by Murray (15) of short-
ening need to n will be followed throughout this
paper.

One of the crucial problems in this type of 
experiment is to find a scoring system for thematic 
stories which is objective enough to provide high 
observer agreement and sensitive enough to re- 
flect changes in motivational states. So a further 
purpose of this experiment is to develop further
the scoring system which was found useful for
hunger (2) and to test its applicability to a more 
complex psychogenic need. The standardization 
of an objective scoring system for projective 
records should ultimately make possible some 
general principles for interpreting them. What 
is even more important, it should open up for 
experimentation the whole field of imagination 
which has been more or less neglected, except 
by the clinicians, since introspection was discred- 
ited as a fruitful approach to arriving at psycho- 
logical principles. 


PROCEDURE 


The materials used in the experiment consisted 
of some simple paper and pencil tests and some 
slides for thematic apperception. There were 
seven short tests: anagrams (4 min.), scrambled 
words I (3 min.), scrambled words II (4 min.), 
and four motor perseveration tests (4) in each 
of which the subject performed a writing task 
as often as possible in the normal manner for 
one min. and then backwards or in some unusual 
manner for one min. The total time taken in- 
cluding pauses between tests for instructions was 
about 25 min. The tests were chosen on the 
basis of past experience (4, 5) for a factor anal-
ysis, in connection with which they will be de-
scribed in detail (6). Their chief function here
was to provide the basis for inducing ego-involve-
ment. That is, there were two main funda-
mentally different conditions under which the
tests were administered. In one (hereafter re- 
ferred to as relaxed), the test administrators were 
introduced by the instructor at a regular class 
session as some graduate students who were 
trying out some tests. This orientation was re- 
inforced by further remarks by the “graduate 
student” to the effect that these tests had been 
recently devised, were still in the developmental 
stage, and that data were being collected in 
order to perfect them. Throughout, the em- 
phasis was clearly on the fact that the experi- 
menters were interested in testing the tests and 
not the students. These instructions were de- 
signed to create an easy relaxed atmosphere in 
which the need for achievement was at a min- 
imum. 

In the other main condition (hereafter referred 
to as failure), the administration of the prelim- 
inary tests was quite different. After the ex- 
perimenters had been introduced to the class by 
the instructors they began passing out the test 
booklets with no explanation as to the purpose 
of the experiment. The only remarks made dealt 
with the first test (anagrams) and the necessity of
paying close attention to directions as the tests
were timed. After completing the first test, the
subjects calculated and recorded their scores on 
it. Then they filled out a short questionnaire 
which asked for: name, high school and college 
attended with estimated class standing in each, 
IQ (if known), and an estimate of their general 
intelligence (above average, average, or below
average). The purpose of the questionnaire was 
only incidentally to obtain information. It was 
primarily to get a subject ego-involved in the 
situation by making his test score known to 
himself and outsiders in relation to a lot of 
other achievement-related facts about him. 

This aim was further supported by the follow- 
ing remarks then made by one of the experiment- 
ers (RAC) given from memory so as to give the 
impression of spontaneity: 

The tests which you are taking directly indi-
cate a person’s general level of intelligence.
These tests have been taken from a group of
tests which were used to select people of high
administrative capacity for positions in Wash- 
ington during the past war. Thus in addition 
to general intelligence, they bring out an indi- 
vidual’s capacity to organize material, his abil- 
ity to evaluate crucial situations quickly and 
accurately; in short, these tests demonstrate 
whether or not a person is suited to be a leader. 

The present research is being conducted for 
the Navy to determine which educational insti- 
tutions turn out the highest percentage of stu- 
dents with the administrative qualifications 
shown by superior scores on these tests. For 
example, it has been found that Wesleyan Uni- 
versity excels in this respect. You are being 
allowed to calculate your own scores, so that
you may determine how well you do in com- 
parison with Wesleyan students. 


At this point the experimenter quoted norms 
for Test I that were so high that practically every- 
one in the class failed and placed in the lowest 
quarter of the Wesleyan group. It was then ex- 
plained that Test I was the single most diagnostic 
test in the battery and thus an individual’s stand- 
ing on this first test would be a good indication 
of how well he might expect to do on the test 
as a whole. The rationale of giving these instruc- 
tions after the first test rather than before was 
partly to place their first scores near the actual 
comparison with the norms and partly to provide 
a basis for testing the initial comparability of 
the groups in another part of the experiment (6).³

After this the subjects went on and completed 
all the paper and pencil tests. At the end they 
added up individual test scores to obtain a total 
score and again were given falsely high norms 
“so that they could see how well they had done 
as a whole in comparison with Wesleyan stu- 
dents.” 

That these instructions succeeded in producing 
very different effects on the students was obvious 
to even the most casual observer. Under the 
ego-involving instructions, they worked hard and
quietly, and there were various obvious expres-
sions of dismay when the norms were announced.
There was no indication that the instructions
were disbelieved. Under relaxed conditions the
subjects were as a whole more relaxed and gave
the impression of enjoying the tasks as they
would a series of parlor games.

3 It goes without saying that the instructions con-
tained nothing that was true, or for that matter
nothing that was completely false. The references to
the Navy and to the Washington administrators (cf.
18) were all partly true and could be checked by
over-curious psychology students. Every other pre-
caution was taken to prevent leaks about these in-
structions since they were crucial to the whole experi-
ment. Different classes at the same institution were
run on the same day and the procedure was not “ex-
posed” to the students at the end or even to the
instructors (whose kindness in cooperating under the
circumstances is greatly appreciated).

At the conclusion of the paper and pencil tests 
used to arouse different need states the other 
experimenter (JWA) read the following instruc- 
tions: 


This next test is a test of your creative imagi- 
nation. A number of pictures will be projected 
on the screen before you. You will have 20 
seconds to look at the picture and then five 
minutes to make up a story about it. Notice 
that there is one page for each picture. The 
same four questions are asked. They will guide 
your thinking and enable you to cover all the 
elements of a plot in the time allotted. Plan to 
spend about a minute on each question. I will 
keep time and tell you when it is about time to 
go on to the next question for each story. You 
will have a little time to finish your story before 
the next picture is shown. 

Obviously there are no right or wrong an- 
swers, so you may feel free to make up any 
kind of a story about the pictures that you 
choose. Try to make them vivid and dramatic, 
for this is a test of creative imagination. Do 
not merely describe the picture you see. Tell 
a story about it. Work as fast as you can in 
order to finish in time. Make them interesting. 
Are there any questions? If you need more 
space for any question use the reverse side. 


In addition to the two major conditions of 
administration so far described, four other at- 
tempts were made to arouse a third intensity of 
n Achievement which would provide the three 
points desirable for establishing trends and for 
making the data comparable with the hunger 
experiment (2). In the first place a success 
group was created by announcing norms after the 
first test and at the end which were so low that 
all or nearly all the students succeeded as com- 
pared with Wesleyan students. This was sup- 
posed to satiate n Achievement; but a prelimi- 
nary analysis of the results indicated that while 
this seemed to be partly true, the need aroused 
by the ego-involving instructions persisted into 
the subsequent thematic apperception test which 
was interpreted as a further test of ability. 
Consequently, the position of this group on the 
need continuum was not clear and the results 
from it did not seem worth reporting here.
Secondly, there was a simple ego-involved group in
which no norms were announced. It was ex- 
pected that these Ss would reflect an aroused 
n Achievement which would be purer and not 
contaminated by recent experiences of failure or 
success. However, the stories written under this 
condition were so tense, inhibited, and cautious 
(cf. 14) that it proved difficult to analyze them, 
and the results from this group will also not be 
reported here. 

The results from the final two variants on the 
major conditions proved meaningful and will be 
reported. In the first of these a group of Wes-
leyan students was run in a neutral but not “re-
laxed” atmosphere. That is, they were task-
oriented rather than ego-oriented (cf. 1), but they
were asked to cooperate seriously and to work
hard on the tasks so that adequate norms for 
them could be established. The reason for these 
instructions was to get a somewhat higher 
n Achievement tension than under the relaxed 
condition in order to maximize individual differ- 
ences as part of another experiment (6). In the 
final group an attempt was made to get an in- 
tenser n Achievement aroused, by giving the Ss 
a taste of success by quoting low norms after the 
first test followed by an even greater failure at the 
end induced by quoting high norms. This will be 
referred to as the success-failure group. 

There was no indication in the ego-involved 
groups that the projective tests were not still part 
of the program of testing for administrative abil- 
ity. The slides used for eliciting the written stor- 
ies consisted of two especially chosen for this 
experiment (two men in overalls looking or work- 
ing at a machine; a young man looking into space 
seated before an open book), followed by two 
taken from the Murray Thematic Apperception 
Test (“father” talking to “son” —TAT 7 BM; boy 
and surgical operation—TAT 8 BM). The pic- 
tures were chosen to suggest achievement—either 
at a specific task or general level and in school- 
related and unrelated situations. 

The Ss were all male, a majority veterans, and 
all college students taking various psychology 
courses at the University of Connecticut (at 
Storrs and the Ft. Trumbull extension), New
Britain State Teachers College, Trinity College, 
and Wesleyan University. They were run in 
regular class room periods either in the summer 
of 1947 or the spring of 1948. The entire testing 
time which included some tests of perceptual
inference reported elsewhere (14) took from 70-80
min., except for the success-failure condition
in which it was necessary to cut out the last three 
of the motor perseveration tests to finish within 
a normal 50-min. class period. 

Scoring—The stories were scored according 
to the same general system used in the hunger 
experiment (2) with additions and modifications 
necessitated by the more complex nature of the 
need involved. Detailed scoring criteria cannot 
be given here for lack of space, but they have 
been reported in full elsewhere (12). The fol- 
lowing brief descriptions will serve at least to 
identify the major categories used. 

g, t, or u I: Achievement imagery is scored
either general (g I) or task (t I); stories with
no achievement imagery are scored as unrelated
(u I). To be general, achievement imagery must
deal with some long term problem of getting
ahead at the ego ideal level (career, schooling,
inventing something, etc.). Everything else,
particularly the specific task situation, was
classified as t I.

Ach or D th: Themas or plots are scored if 
the achievement imagery is central to the story. 
If the plot is concerned with someone who is in 
an achievement difficulty which has or is antici- 
pated as having serious long term effects, it is 
scored as deprivation thema (D th); otherwise it 
is an achievement thema (Ach th), though there 
may be many difficulties in the way of the goal. 

d p or w: Deprivations or blocks in the path 
of progress or indications of past failures, i.e., 
things not running smoothly. If the trouble is 
with the person himself it is scored d p; if it is 
with the world it is scored d w. D th was not 
scored for d also unless there was some second- 
ary and separate source of hindrance. 

N p or g: Need for achievement is stated in 
the story either at the personal level (“He wants 
to be a doctor”) or at the general level (“He 
wants to serve humanity”). 

I+, -, or o: Instrumental activity which is
either successful (I+), unsuccessful (I-), or of
doubtful outcome (I o). The person in the story
must do something (even if only think or de- 
cide) about achieving his goal which is separate 
from the statement of the situation and the state- 
ment of outcome: e.g., “the boy graduates from 


school” is scored for outcome but considered too 
passive to represent instrumental activity. 

Ga+, —, or o: Anticipations of outcomes (goal 
responses) which may be either of success (Ga+, 
“He is thinking of the day when he’ll be famous”) 
or failure (Ga—, “He is worried about what will 
happen”) or neither (Ga o, “He is wondering 
what will happen”). 

nu or ho P: Nurturant or hostile press. Some 
person in the story is either actively helping or 
hindering the person working for achievement. 
The hindrance must be more hostile than a static 
block (see d w above). 

S: Substitution. A person who meets with an 
obstacle in his achievement instigation-action se- 
quence adopts a substitute instrumental act or 
substitute goal response (“He drowns his sor- 
rows in a tavern”). 

G or G'+ or —: Goal responses which occur 
either within the story (G) or at the end (G’) 
and which may be either positive affect (“He was 
happy in his new job”) or negative affect (“The 
boy is worried over having flunked his exam”). 

O+, —, or o: Outcomes of the whole story are 
judged according to whether they are happy 
(O+), unhappy (O-), or doubtful (O o). 
Finer breakdowns were made but did not prove 
useful. The total outcome was not necessarily 
the same as that for the instrumental activity and 
was also separate from the final affect (G’). For 
example, “They fixed the machine” is scored O+ 
but not G’+, because it doesn’t say they were 
pleased about it. 

The following story illustrates how the scoring 
was used. After each word or phrase scored is 
written in parenthesis the scoring symbol appli- 
cable: 

1. What is happening? Who are the persons?
“The boy is being talked to by his father,
maybe something about what has happened in
school, or he may be planning to get married.”

2. What has led up to the situation, that is,
what has happened in the past?
“He may have flunked out of school (D th) and
is being lectured on what is expected of him
(nu P).”

3. What is being thought? What is wanted?
By whom?
“The father wants the boy to make good (N p),
he is thinking that he wants his son to follow
in his footsteps and make good in life
(Ga+, g I).”

4. What will happen? What will be done?
“The boy will do his best, will go back to school
because he has learned a lesson he will never
forget (I+). He will make good this time and
be a success (O+).”

As this example shows, the scoring was not 
done from the viewpoint of a single character 
with whom the writer supposedly identified, al- 
though this is the usual method of procedure. 
Thus the father’s wish for the son’s success is 
scored N p (father’s viewpoint) and the father’s 
help is scored nu P (son's viewpoint). The 
rationale for this was the conviction that deter- 
mination of the person with whom the writer 
identified was often difficult and that it was not 
necessary—e.g., in this instance it is just as pos- 
sible to suppose that the writer is projecting his 
wish to do well into the father figure as into the 
son. Note also that the second statement of the 
father’s wish is not scored again. A given cate- 
gory is scored only once per story no matter how 
many times it appears. 

This example shows how decisions on the 
scoring of a specific item were affected by the 
total context and by the scoring of other items. 
Thus, it is not until the whole story is read that 
the thema is clearly one of achievement and not 
marriage, and it is not until the third paragraph 
that the decision can be made that the imagery 
is general and the thema one of deprivation. The 
factors which lead to this decision are the pres- 
ence of Ga+ and N p (see above definitions). 
It was recognized that the interdependence of 
scoring categories was not wholly desirable from 
the statistical viewpoint, but it soon became 
obvious that the interdependence existed at the 
intuitive level for categories like g I and D th, no 
matter how discretely the definitions might be 
drawn. Hence, it seemed best to state as ex- 
plicitly as possible any other categories that were 
usually taken into account in the normal process 
of arriving at a judgment on a given category. 

The way in which all the stories were scored 
will be described in full under results in the sec- 
tion on reliability of the scoring. It involved two 
judges working together without knowledge of 
which of three groups (neutral, failure, and suc- 
cess-failure) the stories belonged to. 


RESULTS 


The main results of the experiment are shown 
in Tables 1, 2, 3, and 4, which summarize the 


TABLE 1


THE NUMBER AND PERCENTAGE OF
ACHIEVEMENT-RELATED STORIES
WRITTEN UNDER RELAXED, FAILURE, AND
SUCCESS-FAILURE CONDITIONS


The number of stories in each condition is 156
(39 Ss × 4 stories)


                  Relaxed       Failure      Chi-     Success-Failure
Imagery           N     %       N     %      square   N     %

task, t I         73   46.8     56   35.9     3.82    47   30.1
general, g I      26   16.7     75   48.1    35.16    85   54.5
unrelated, u I    57   36.5     25   16.0    16.94    24   15.4


Chi-square is 3.84 and 6.64 at the .05 and .01 levels of significance,
respectively.


TABLE 2


THE NUMBER AND PERCENTAGE OF THE
ACHIEVEMENT-RELATED STORIES
WRITTEN UNDER DIFFERENT CONDITIONS
SHOWING VARIOUS STORY CHARACTERISTICS
RELATED TO THE DESCRIPTION OF THE SITUATION


                      Relaxed       Failure      Chi-     Success-Failure
Number of stories       99            131        square     132
                      N     %       N     %               N     %

Plot       Ach th     59   59.6     83   63.4             98   74.2
           D th        6    6.1     25   19.1     8.41    14   10.6
Obstacles  d p        12   12.1     23   17.6             24   18.2
           d w        22   22.2     17   13.0     3.41    21   15.9


Chi-square is 3.84 and 6.64 at the .05 and .01 levels of significance,
respectively.


frequency of appearance of various scoring cate- 
gories for the relaxed, failure, and success-failure 
conditions. The results from the neutral condi- 
tion, which generally fell between the relaxed and 
failure conditions, will be reported only in sum- 
mary form. In all the tables increases or de- 
creases from the relaxed to the failure condition 
are tested for significance by means of chi-square.⁴
Table 1 shows that there is a large and


4 It should be noted that the population considered 
here is number of stories, rather than, as in the 
hunger experiment, the number of Ss showing a
characteristic at least once. The latter measure, 
TABLE 3


THE NUMBER AND PERCENTAGE OF THE
ACHIEVEMENT-RELATED STORIES
WRITTEN UNDER DIFFERENT CONDITIONS
SHOWING VARIOUS STORY CHARACTERISTICS
RELATED TO THE CHARACTERS’
REACTION TO THE SITUATION


                        Relaxed       Failure      Chi-     Success-Failure
Number of stories         99            131        square     132
                        N     %       N     %               N     %

Need stated:
  N p &/or g            21   21.2     58   44.3    13.30    64   48.5
Instrumental acts:
  I+                     9    9.1     41   31.3    16.29    31   23.5
  I-                    [illegible]   [illegible]            6    4.5
  I o                    6    6.1     10    7.6              7    5.3
Anticipatory goal response:
  Ga+ &/or Ga-          15   15.2     47   35.9    12.33    60   45.5
  Ga o                   6    6.1     11    8.4             15   11.4


Chi-square is 3.84 and 6.64 at the .05 and .01 levels of significance,
respectively.


very significant increase in the number of stories 
dealing with general or long term achievement 
while there is a decrease in the number of stories 
with no achievement imagery and of those with 
task achievement imagery. 
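
The chi-square values in Table 1 can be checked from the story counts alone. The sketch below assumes the ordinary uncorrected 2 × 2 chi-square with one degree of freedom, comparing the relaxed and failure conditions on the presence or absence of each imagery category. The text does not state the exact formula used, so the function name and the formula are illustrative assumptions, although the results agree with the tabled values to within about .01.

```python
def chi_square_2x2(count_1, total_1, count_2, total_2):
    """Uncorrected chi-square (1 df) for two conditions, category present vs. absent."""
    a, b = count_1, total_1 - count_1   # condition 1: present, absent
    c, d = count_2, total_2 - count_2   # condition 2: present, absent
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


STORIES_PER_CONDITION = 156  # 39 Ss x 4 stories

# Story counts from Table 1: (relaxed, failure)
TABLE_1 = {
    "t I": (73, 56),
    "g I": (26, 75),
    "u I": (57, 25),
}

chi_values = {
    category: chi_square_2x2(relaxed, STORIES_PER_CONDITION,
                             failure, STORIES_PER_CONDITION)
    for category, (relaxed, failure) in TABLE_1.items()
}

for category, value in chi_values.items():
    print(f"{category}: chi-square = {value:.2f}")
```

Values above 3.84 and 6.64 correspond to the .05 and .01 levels quoted beneath the tables.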

Because of this shift the method of computing 
percentages in Tables 2, 3, and 4 has been 
changed. Since there were significantly more 
stories with achievement imagery in the two 
failure conditions, there is a correspondingly 
greater opportunity for other achievement-related 
characteristics to appear in these conditions. But 
the important question is: given an achievement 
story, are there significant differences in its in- 
ternal characteristics when written under different 
conditions? Hence, the results for further char- 


while easier to interpret statistically, is not as appli- 
cable to the data of this experiment because of the 
much greater frequencies obtained for many of the 
need-related categories. The authors realize that a
chi-square test of significance applied to repeated 
measures from the same Ss is hard to interpret be- 
cause of the peculiar nature of the universe to which 
the inference is made, but have decided to use it for 
two reasons: (1) other statistics appear to have even 
more serious objections, and (2) the differences found 
to be significant for all four pictures and used in 
calculating the final n Achievement score were also 
significant when only the results from the single most 
diagnostic picture (TAT 7 BM) were used.


acteristics are presented as percentages not of all 
stories but only of the achievement stories in each 
condition. 
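
The change of percentage base can be illustrated with the figures already given. In this sketch the base for each condition is taken to be the number of stories containing task or general achievement imagery in Table 1, and the deprivation thema counts are those of Table 2. The variable and function names are illustrative only.

```python
# Story counts from Table 1 for each condition.
IMAGERY_COUNTS = {
    "relaxed":         {"t I": 73, "g I": 26, "u I": 57},
    "failure":         {"t I": 56, "g I": 75, "u I": 25},
    "success-failure": {"t I": 47, "g I": 85, "u I": 24},
}

# Deprivation thema (D th) counts from Table 2.
D_TH_COUNTS = {"relaxed": 6, "failure": 25, "success-failure": 14}


def achievement_story_base(counts):
    """Number of stories with any achievement imagery (task or general)."""
    return counts["t I"] + counts["g I"]


def percent(n, base):
    """Percentage of the base, rounded to one decimal as in the tables."""
    return round(100 * n / base, 1)


bases = {cond: achievement_story_base(c) for cond, c in IMAGERY_COUNTS.items()}
d_th_percent = {cond: percent(D_TH_COUNTS[cond], bases[cond]) for cond in bases}

for cond in bases:
    print(f"{cond}: base = {bases[cond]}, D th = {d_th_percent[cond]} per cent")
```

Computed against all 156 stories, the same D th counts would give smaller percentages and would mix the shift in imagery with the shift in thema.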

Table 2 shows only a significant increase in 
the number of deprivation themas. A compari- 
son of the failure with the success-failure results 
suggests that the former reflect sensitively the Ss’ 
greater failure experience since the combined 
thema totals are nearly identical for the two con- 
ditions (82.5 and 84.8 per cent respectively), 
both being considerably larger than the same 
figure for the relaxed condition (65.7 per cent). 

Table 3 shows a larger number of significant 
shifts in the forward-looking, striving aspects of 
the stories. This table indicates that an aroused 
n Achievement increases the likelihood that char- 
acters in the story will be described as wanting to 
get ahead (N p), as doing something successful 
about getting ahead (I+), and as thinking in 
advance about success or failure (Ga+ or Ga—). 
In Table 4 the shifts appear in the number of 
people seen as aiding or hindering achievement 
(nu or ho P) and in the frequency with which 
positive affect is specifically mentioned, either in 
the course of the story or at the end (G or G'+). 
It is interesting to note that there are no signifi- 
cant changes in the outcome category, despite the 
fact that most of the present a priori systems for 
scoring the TAT (3, 15, 20) lay emphasis on this 


TABLE 4


THE NUMBER AND PERCENTAGE OF THE
ACHIEVEMENT-RELATED STORIES
WRITTEN UNDER DIFFERENT CONDITIONS
SHOWING VARIOUS STORY CHARACTERISTICS
RELATED TO THE OUTCOME OF THE SITUATION


                          Relaxed       Failure      Chi-     Success-Failure
Number of stories           99            131        square     132
                          N     %       N     %                 %

Press: ho P &/or nu P     10   10.1     29   22.1     5.82      15.2
Substitution: S            2    2.0     11    8.4     3.20 a     5.3
Goal response:
  G &/or G'+               4    4.0     33   25.2    18.61      23.5
  G &/or G'-              18   18.2     34   26.0               24.2
Outcomes:
  O+                      47   47.5     63   48.1               52.3
  O-                      16   16.2     12    9.2               18.2
  O o                     36   36.2     56   42.8               29.5


a Corrected for continuity.


Chi-square is 3.84 and 6.64 at the .05 and .01 levels of significance,
respectively.
characteristic, and despite the fact that a far
more elaborate breakdown of different types of 
endings was actually made than is reported here. 
There is one shift in outcomes, however, for the 
success-failure group, which gives significantly 
fewer doubtful outcomes to its stories than does 
the failure group. This suggests that repeated 
failure may cause an unwillingness to state the 
outcome of an achievement sequence, especially 
an unfavorable (O-) outcome, since this is the
category which is reduced in the failure group. 

A tabulation of the frequency of appearance of 
each story characteristic for each S was made as 
a basis for obtaining a single summary n Achieve- 
ment score. The characteristics which showed a 
significant increase in Tables 1-4 from the
relaxed to the failure condition were scored +1
and those which decreased were scored -1.⁵
Thus, there were seven positive characteristics 
(g I, D th, N, I+, Ga+ or —, nu or ho P, and 
G or G'+) and two negative characteristics (t I 
and u I). The results from the success-failure 
group were not taken into account in developing 
the scoring system because it was felt that the 
need state aroused might be more complex than 
in the straight failure group. 
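
The unit scoring just described can be sketched as follows. The weights follow the seven positive and two negative characteristics listed above. Two points are assumptions made for illustration: that paired symbols (N p or N g, Ga+ or Ga-, nu P or ho P, G+ or G'+) count as one characteristic per story, and that a subject's score is the sum over his four stories. The names `COLLAPSE`, `WEIGHTS`, `story_score`, and `subject_score` are hypothetical.

```python
# Map detailed scoring symbols onto the characteristics used in the summary score.
COLLAPSE = {
    "N p": "N", "N g": "N",
    "Ga+": "Ga+/-", "Ga-": "Ga+/-",
    "nu P": "P", "ho P": "P",
    "G+": "G+", "G'+": "G+",
}

# +1 for characteristics that increased under failure, -1 for those that decreased.
WEIGHTS = {
    "g I": 1, "D th": 1, "N": 1, "I+": 1, "Ga+/-": 1, "P": 1, "G+": 1,
    "t I": -1, "u I": -1,
}


def story_score(symbols):
    """Score one story; each characteristic counts once however often it appears."""
    characteristics = {COLLAPSE.get(symbol, symbol) for symbol in symbols}
    return sum(WEIGHTS.get(c, 0) for c in characteristics)


def subject_score(stories):
    """Sum the story scores over one subject's record (assumed: four stories)."""
    return sum(story_score(story) for story in stories)


# Symbols assigned to the illustrative father-and-son story in the text.
example_story = ["D th", "nu P", "N p", "Ga+", "g I", "I+", "O+"]

# A hypothetical four-story record built around that story.
example_record = [
    example_story,
    ["t I", "I o"],          # task imagery only
    ["u I"],                 # no achievement imagery
    ["g I", "N g", "N p"],   # general imagery with a stated need
]

print(story_score(example_story))
print(subject_score(example_record))
```

Categories that did not shift significantly, such as the outcome O+ in the example story, carry no weight and so leave the score unchanged.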

Table 5 presents the mean n Achievement 
scores for each condition. The means for the 


TABLE 5


MEAN n ACHIEVEMENT SCORES FOR THE
RELAXED, NEUTRAL, FAILURE, AND
SUCCESS-FAILURE CONDITIONS


                  Relaxed   Neutral   Failure   Success-Failure

N                   39        39        39        39
Mean              -1.00      3.13      5.82      6.00
σm                  .46       .69       .82       .73
Diff.                    4.13      2.69       .18
σ diff.                   .83      1.07      1.14
Critical ratio           4.98      2.51       .16
P                       <.01      <.02      >.50


5 A scoring system was also tried which weighted 
the changes in accordance with the improbability of 
their having occurred by chance. However, the cor- 
relation between weighted and unweighted scores for 
the 39 Ss in the neutral condition was .956, confirming 
the high relationship found in other studies and lead- 
ing to the adoption of the simpler unweighted or unit 
scoring. 


relaxed and failure conditions differ very
significantly as they should from the way the scoring
system was devised. The success-failure condi- 
tion was almost exactly equal to the failure con- 
dition as had been indicated by the comparisons 
in Tables 1-4. The neutral condition showed a 
moderate need strength, by this scoring system, 
which was significantly greater than the relaxed 
condition and significantly less than the two fail- 
ure conditions. This last comparison is particu- 
larly important methodologically, because the 
papers from these three groups had been mixed 
together and were all scored together without the 
judges’ knowing to which group any paper be- 
longed. Thus, with all possibility of bias re- 
moved, there is still a significant mean difference 
in n Achievement score between a presumed low 
and high intensity of induced n Achievement. 
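
The critical ratios for adjacent conditions can be recomputed from the means and their standard errors. The sketch assumes the usual formula for independent groups, in which the standard error of a difference is the square root of the sum of the squared standard errors of the two means; the text does not state the formula. The standard errors are those of Table 5 as read here (.46, .69, .82, .73), and the names used are illustrative.

```python
import math

MEANS = {"relaxed": -1.00, "neutral": 3.13, "failure": 5.82, "success-failure": 6.00}
SE_MEAN = {"relaxed": 0.46, "neutral": 0.69, "failure": 0.82, "success-failure": 0.73}


def critical_ratio(lower, higher):
    """Difference between two condition means divided by its standard error."""
    difference = MEANS[higher] - MEANS[lower]
    se_difference = math.sqrt(SE_MEAN[lower] ** 2 + SE_MEAN[higher] ** 2)
    return difference / se_difference


comparisons = [
    ("relaxed", "neutral"),
    ("neutral", "failure"),
    ("failure", "success-failure"),
]
ratios = {pair: critical_ratio(*pair) for pair in comparisons}

for (lower, higher), ratio in ratios.items():
    print(f"{higher} vs. {lower}: critical ratio = {ratio:.2f}")
```

With these inputs the first two standard errors of the difference come out at .83 and 1.07, as tabled. The third comes out near 1.10 rather than 1.14, which may reflect a misread digit in the last standard error; the critical ratio still rounds to .16.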

Reliability of the Scoring—A matter of con- 
siderable methodological importance, in view of 
the present tendency of experimentalists to 
eschew free verbal reports, is the reliability of the 
scoring system used here. Consequently, reliabil- 
ity was studied intensively from three different 
angles. First, an attempt was made to determine 
the extent to which the judges agreed on a given 
category for a particular story. Since agreement 
is almost certainly a function of amount of the 
judges’ experience with the scoring system, a 
measure of it was taken at the end of the scor- 
ing, after one of the judges had had a year’s pre- 
vious experience with the system amounting to 
the scoring of at least 3,000 stories and the other 
had scored at least 1,000 stories. The two 
judges always worked together, one reading the 
story aloud so that both could form tentative 
judgments independently, which were discussed, 
if they differed, in making the final decision. At 
the time the test was made they were spending 
an average of two to three min. per story, or
at the most from five to ten min. per S. The 
test consisted of drawing 10 records at random 
from the neutral, failure, and success-failure 
groups and rescoring them. The index of agree- 
ment was computed by dividing twice the agree- 
ments by the sum of the items scored on each of 
the two occasions. It turned out to be 291/321,
or 91 per cent. 
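
The index of agreement can be written out as a short computation. In the sketch each scoring occasion is represented as a set of (story, category) pairs, and an agreement is taken to be a pair present on both occasions; this representation, the function name, and the small sample data are hypothetical. Only the ratio 291/321 comes from the text.

```python
def index_of_agreement(first_occasion, second_occasion):
    """Twice the agreements divided by the total items scored on the two occasions."""
    agreements = len(first_occasion & second_occasion)
    return 2 * agreements / (len(first_occasion) + len(second_occasion))


# Hypothetical scorings of two stories on two occasions: (story number, category).
first = {(1, "g I"), (1, "N"), (1, "I+"), (2, "t I"), (2, "Ga")}
second = {(1, "g I"), (1, "N"), (2, "t I"), (2, "Ga"), (2, "P")}

sample_index = index_of_agreement(first, second)
print(f"sample index of agreement = {sample_index:.2f}")

# The figure reported in the text, expressed as a percentage.
reported_percent = round(100 * 291 / 321)
print(f"reported agreement = {reported_percent} per cent")
```

An index of this form penalizes both omissions and additions, since an item scored on only one occasion enlarges the denominator without adding to the agreements.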

Secondly, reliability was approached from a 
less conservative viewpoint by attempting to 
measure the extent to which the totals are stable 
for a given category on two judgments of the
same records. This is more to the point in esti- 
mating the dependability of group shifts, since 
judges may quite possibly miss a category in one 
story and pick it up somewhere else leaving the 
total the same though not the percentage agree- 
ment. To check this, the stories written by the 
39 relaxed Ss were completely rescored after all 
the other stories had been finished.⁶ There was
a nearly significant increase in the proportion of 
stories scored as containing achievement imagery, 
due to a conscious liberalization in the judges’ set, 
but this increase did not change the ratio of 
general to task imagery or any of the other cate- 
gories scored. Seventeen out of 22 of the cate- 
gory totals were within three points of each 
other. 

Also relevant to this point is the comparability 
of the totals for various categories for the failure 
and success-failure groups. They are very close 
in nearly every case in Tables 1-4, and the mean 
over-all n Achievement scores are practically iden- 
tical. This shows that category totals are apt to 
be quite stable even when obtained on two dif- 
ferent groups of Ss, and even when there are 
minor differences in the method of arousing 
n Achievement. 

In the third place, the reliability of an indi- 
vidual’s over-all n Achievement score was tested 
by correlating the scores obtained for 30 indi- 
viduals on two different scoring occasions. The 
product moment correlation was .946, indicating 
fairly high stability of an individual’s score for 
his whole record. Furthermore, this correlation 
is probably conservative, since 20 of the 30 Ss 
came from the relaxed group, which reduced the 
range of scores, and since the scoring was done 
much more hastily on both occasions than it nor- 
mally would be in a clinical situation. 
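
A product moment correlation between two scorings of the same records can be computed as sketched below. The six pairs of scores are invented for illustration and are not the 30 records of the experiment, so the coefficient printed is not the .946 reported above. The function name is hypothetical.

```python
import math


def pearson_r(x, y):
    """Product moment correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


# Hypothetical n Achievement scores for six Ss on two scoring occasions.
first_scoring = [-2, 0, 1, 3, 5, 8]
second_scoring = [-1, 0, 2, 3, 4, 9]

r = pearson_r(first_scoring, second_scoring)
print(f"r = {r:.3f}")
```

Because the coefficient depends on the spread of scores, a sample drawn mostly from one condition with a narrow range will tend to give a lower value, which is the point made above about the relaxed group.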


DISCUSSION 


Validity of n Achievement Score—No one 
can deny that there are differences in the story 
characteristics which appear in the relaxed as 
compared with the failure condition, but is it 


6 An attempt was made to mix the relaxed stories 
with others in the rescoring but it was not continued 
beyond the first 10, since the judges who had scored 
these same stories several times before easily recog- 
nized them as being very different from the others 
with which they were mixed. Any further attempt 
to conceal their identity seemed a waste of time. 


proper to assume that these differences represent 
a difference in the need for achievement in the 
two groups? This is the central problem of valid- 
ity, of whether the score derived from these dif- 
ferences measures anything of importance, or 
more particularly of whether it measures the 
n Achievement which it is supposed to measure. 
There are two kinds of evidence which argue that 
it is a valid measure of n Achievement. 

The nature of the procedure used to arouse 
the need provides the first basis for assuming that 
n Achievement was more intense in the two fail- 
ure conditions. In discussing what we have la- 
belled n Achievement, after Murray (15, p. 164), 
Sears states: “There are many names for this 
learned drive: pride, craving for superiority, ego- 
impulse, self-esteem, self-approval, self-assertion, 
but these terms represent different emphases or 
different terminological systems, not fundamen- 
tally different concepts. Common to all is the 
notion that the feeling of success depends on the 
gratification of this drive, and failure results from 
its frustration” (17, p. 236). This suggests that 
the experimental operations which will satiate 
and arouse the drive are success and failure. 
However, the success and failure must be in re- 
lation to some achievement goal which the Ss have 
set for themselves. That is, in the case of a 
physiological need like hunger, it is only neces- 
sary to deprive the Ss of food to arouse the drive, 
since the organism automatically by the con- 
sumption of energy produces in time a need for 
food. But in the case of a psychogenic need it 
is necessary first to induce Ss to want some goal 
like achievement. In the present experiment that
was supposedly done by giving the Ss an oppor- 
tunity to perform on some tests which were de- 
scribed to them in such a way that doing well 
should lead them to feel increased pride, self- 
esteem, self-approval, feelings of success, etc. 
Since these terms define what is commonly meant 
by the striving for success or n Achievement, 
if the instructions and the tests were such as 
to arouse these feelings, then by definition 
n Achievement was aroused in the failure and 
success-failure groups.⁷ And it does seem
reasonable to assume that the attainment of high


7 Since these instructions are also the ones com- 
monly called “ego-involving” by other workers in the 
field (cf. 1), it is apparent the authors believe that
ego-involvement and n Achievement arousal are the
same thing under certain conditions.
intelligence and leadership as suggested in the
instructions are two goals which in our society
would lead to the feelings mentioned. 

Granted that n Achievement was aroused by 
the instructions, it further seems reasonable to 
suppose that failure-frustration would lead to a 
heightened need. Although this assumption is 
supported by experimental evidence (8) and by 
the deprivation method of arousing physiological 
drives, we recognize that it may complicate the 
resulting picture here. That is, granted that 
failure does heighten n Achievement, it may also 
lead to the projection of material which is
specific to the experience of failure rather than
characteristic of a “pure” heightened need.⁸ It was
this conviction that led to the collection of stories 
from an ego-involved group which had had 
neither success nor failure. Unfortunately, as 
has been reported under Procedure, for reasons 
given fully elsewhere (cf. 14) the Ss in this group 
were too inhibited to write stories which could 
be readily analyzed. So the main comparisons 
had to be made between a relaxed condition and 
a condition in which n Achievement was
augmented by failure.

The comparison with the effect of hunger on 
similar stories (2) provides the second basis for 


8 The fact that the failure group showed more dep- 
rivation themas than the success-failure group, while 
both showed about the same high number of themas 
as compared with the relaxed group, would support 
this proposition. That is, one could argue that 
heightened need tension results in more achievement 
imagery central to the plot, but that failure as a 
method of increasing this tension shifts some of this 
plot or thema imagery to the deprivation category. 


arguing that a need has been aroused by the ex- 
perimental conditions. However unwise it may 
prove to be to have used failure to heighten the 
need intensity, it serves to make the need-arousal 
method more nearly comparable to the depriva- 
tion used to increase hunger (2). Consequently, 
it becomes more legitimate to ask, what is the 
evidence that food-deprivation and achievement- 
deprivation affect imagination in the same way? 

Table 6 provides the positive evidence that the 
two needs have the same general effect. The 
case rests largely on the first three items (D th, 
N and I+), since failure of categories to shift 
may mean failure of the scoring system at some 
point. Even so the evidence is impressive,
considering the fact that a complex psychogenic need
like that for achievement might be supposed on 
a priori grounds to differ extensively from a sim- 
ple primary need like hunger. 

TABLE 6


A COMPARISON OF THE STORY CHARACTERISTICS SHOWING SIGNIFICANT CHANGES
FOR BOTH INCREASED n FOOD AND INCREASED n ACHIEVEMENT


1. An increase in the number of plots dealing primarily with deprivation
of the goal in question (D th)

2. An increase in the number of times that characters in the stories
were said to want or wish for the goal in question (N)

3. An increase in the mention of instrumental activities which are
successful in dealing with the need-related problem (I+)

4. No change in the number of plots dealing with direct attainment of
the goal (F th or Ach th)

5. No change in the amount of substitute activity, in instrumental
activity of unsuccessful or doubtful outcome, or in negative affect
(represented by subjective hostility in the food experiment)


There is also some negative evidence, i.e.,
instances of categories which shift in one
experiment but not in the other. But these can rather
easily be explained in terms of differences in
procedure in the two experiments. For example, the
biggest lack of correspondence in the two
experiments was in the way a higher need decreased the
favorable aspects in the food stories and increased
them in the achievement stories. A case in point
is the decrease in friendly press for hunger and
the increase in nurturant press for n Achievement.
This can be explained by the fact that the
two control groups were not equivalent. The
one-hour hunger group was satiated with respect
to hunger, whereas the relaxed group in the
present experiment could best be described as
unmotivated with respect to n Achievement.
Satiation undoubtedly carries over to increase the
frequency of appearance of favorable story aspects,
as has been shown for the n Achievement
situation when success is given the Ss (12, 14). Since
the low need groups therefore doubtless differed
initially in the amount of favorable material
projected, it is not surprising that high need groups
in the two experiments produced different or even
opposite effects. Other incongruencies between
the two experiments are largely due to changes
in the scoring system necessitated by the greater
complexity of n Achievement (e.g., the general
imagery category).

If one notes the major agreements and explains 
away in this manner the disagreements, Table 6 
can be said to supply considerable support for the 
argument that the conditions of this experiment 
induced a state in the Ss which affected their im- 
agination in the same general way as an increase 
in hunger. To the extent that one accepts hunger 
as a need, it would therefore seem valid to refer 
to the state induced by ego-involvement and
failure as a need. Even if one grants this, however,
it must of course still be shown that the situation- 
ally induced need affects apperception in the same 
way as a strong character need would, as clini- 
cally or otherwise defined. This ultimate prob- 
lem of validity must await further study. 

Clinical Applications —In the meantime, the 
data are sufficiently clearcut to provide some 
guidance for the person working with the TAT 
clinically. They suggest in the first place what 
story characteristics are apt to be important as 
indicators of need strength. Although the valid- 
ity of these indicators is by no means finally 
established, they do represent an advance over
the logical or a priori validity earlier workers in 
the field have been forced to assume for their 
scoring systems. In the second place, the data 
suggest to the clinician that the conditions of 
administration of the TAT are of considerable 
importance in determining the dynamic content 
of the stories. Stories written under relaxed, 
neutral, and failure conditions differed so much 
in the present experiment as to suggest more cau- 
tion than has heretofore been indicated in assum- 
ing that the basic personality picture given by 
the TAT is not influenced by recent experiences 
(3, 7). Our results suggest strongly that the 
clinician should be careful to investigate such 
matters as how the subject conceives of the test, 


his reason for taking it, his relation to the tester 
who may or may not have given him other tests 
that have involved success or failure, etc.
Nature of Motivation. —One of the most im- 
portant implications of this experiment is sug- 
gested by a consideration of the categories which 
shifted in frequency when the need was presum- 
ably aroused. Most, if not all of them, appear 
to have a future reference—for instance, the 
stated wish for achievement, successful instru- 
mental striving, anticipatory goal responses, and 
positive affect at the end of the story. Two other 
important characteristics—the increase in general 
imagery and the increase in deprivation themas— 
also appear to refer to the future on further 
scrutiny, the former because it is defined as
involving a person’s career or life work, and the
latter because it is defined as a situation in which 
forces are at work against a person which would 
make him worse off in the future. In both in- 
stances the presence of stated need or anticipa- 
tory goal response was often useful in defining 
the category. On the other hand categories did 
not change which seemed to involve more of an 
objective description of the situation (plots, ob- 
stacles and outcomes) without the striving or 
anticipatory dimension. This, taken with similar 
earlier evidence (14), suggests that it is one of
the major characteristics of motivation—at least
achievement motivation—to be anticipatory or
forward looking. This might seem to be a somewhat
radical departure from the usual conception
of a motive as a persisting deficit stimulus,
but oddly enough Hull (9), working from en- 
tirely different data, has come to much the same 
conclusion—namely, that fractional anticipatory 
goal responses are the key to understanding pur- 
poseful and motivational phenomena. In fact, 
one can argue that the anticipatory goal re- 
sponses observed in this experiment supply a 
kind of direct confirmation of Hull’s view which 
has been very difficult to obtain with animals.

Methodological Considerations.—Last but not
least these results have an important bearing on 
the experimental methodology of handling ver- 
bal material. They report a method for scoring 
written thematic apperception stories which is
sensitive enough to distinguish between the con- 
ditions under which the stories are written, which 
is objective enough to yield high agreement on a 
repeat scoring by two trained judges working to- 
gether, and which is easy enough to apply quickly 
to an individual record. This in itself is of considerable
importance in a field in which prior
scoring systems have either been so complex or 
so dependent on clinical insight (3, 15, 20) that 
they are of little use to the experimental psy- 
chologist interested in studying imaginative proc- 
esses. 

The potential value to psychological theory of 
an objective scoring method for free verbal be- 
havior is illustrated by the fact that its applica- 
tion in this experiment clearly indicates that 
phantasy does not always serve the purpose of 
wish-fulfillment or substitute gratification for 
pleasures denied in reality, an assumption which 
has been rather frequently made (cf. 10, p. 93). 
Instead, a study of the variety of story charac- 
teristics which shifted in this experiment with an 
increase in need supports the parsimonious as- 
sumption that imaginative behavior is governed 
by the same general principles as govern any be- 
havior. For example, a variety of experiments 
show the same increase in instrumental activity 
with increased drive at the gross motor level; 
others, as in the standard Pavlovian conditioning, 
show the same increase in anticipatory goal re- 
sponses (salivation). If one grants that the prin- 
ciples governing imaginative behavior are no dif- 
ferent from those governing performance when 
both are analyzed according to the same cate- 
gories of response, then the method used here 
becomes a more subtle and flexible approach to 
the establishment and extension of those princi- 
ples than the ordinary method of studying per- 
formance. Thus, for example, it would be diffi- 
cult to get a performance response which would 
correspond to the anticipation of deprivation 
which follows drive arousal at the imaginative 
level. One might even go so far as to suggest 
that by the use of this method Tolman could 
study much more directly the “cognitive maps” 
which the behavior of his rats has led him to 
infer are the important intervening variables in 
determining behavior (19). 


SUMMARY 


Over 200 male college students wrote five-min. 
stories in response to four slides depicting 
achievement-related situations under the influ- 
ence of various interpretations of the meaning of 
the story writing and several short pencil and 
paper tests taken just previously. The stories 
were analyzed completely for 39 Ss from each 
of four conditions: (1) a relaxed condition, in 
which all the tests were interpreted as being in 
an experimental stage, (2) a neutral condition, in 
which the tests were described as experimental 
but in which the Ss were urged to do their best 
to establish some norms, (3) a failure condition, 
in which the tests were interpreted as standard- 
ized measures of intelligence and leadership and 
in which the Ss wrote their stories after failing 
on the paper and pencil tests, and (4) a success-failure
condition, which was the same as the failure
condition except that the Ss succeeded on
the first part of the paper and pencil tests and
then failed on the whole test. The stories from 
a group who wrote under ego-involving instruc- 
tions but without success and failure proved too 
inhibited to analyze, and those from a group who 
succeeded throughout are not reported because 
the meaning of the situation to the Ss did not 
seem clear. The scoring followed in general the 
usual analysis of an overt behavioral sequence 
with adaptations from Murray. The following 
results were obtained: 

1. The scoring method, when used by two 
experienced judges working together, could be 
quickly applied (two to four min. per story), 
was sensitive enough to discriminate among the 
stories written under different conditions even 
when mixed together before judging, and was 
objective enough to yield on rescoring a 91 per 
cent agreement for individual categories and a 
rescoring reliability coefficient for the n Achieve- 
ment score developed of .948. 

2. On the assumption that the relaxed and 
failure conditions represented a low and high 
degree of induced need for achievement, a com- 
parison was made of the category shifts between 
these two groups. The following changes occurred
at least at the .05 level of significance: a
decrease in unrelated and task achievement imag- 
ery, an increase in general achievement imagery, 
achievement-related deprivation themas, stated
needs, successful instrumental acts, anticipatory
goal responses, nurturant or hostile press, and
positive affective states. In nearly every case the 
success-failure condition showed the same per- 
centages as the failure condition providing a 
category total stability check. 

3. A single n Achievement score was com- 
puted for each individual by summing the char- 
acteristics he showed which increased reliably for 
the group and subtracting those which decreased 
reliably. The mean n Achievement scores computed
in this way increased significantly in accordance
with the presumed increase in induced
need from relaxed, to neutral, to the failure 
conditions. 
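The computation described in result 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' scoring procedure: the category identifiers, the function name `n_achievement_score`, the tallying convention (a count per category per subject), and the sample record are all hypothetical. Only the lists of categories that increased or decreased come from result 2.

```python
# Categories that rose or fell reliably between the relaxed and failure
# conditions (from result 2). The identifiers themselves are illustrative.
INCREASED = {
    "general_achievement_imagery",
    "deprivation_thema",
    "stated_need",
    "successful_instrumental_act",
    "anticipatory_goal_response",
    "nurturant_or_hostile_press",
    "positive_affect",
}
DECREASED = {"unrelated_imagery", "task_achievement_imagery"}


def n_achievement_score(counts):
    """Sum occurrences of 'increased' categories and subtract 'decreased' ones.

    counts maps a category name to the number of times it was scored in
    one subject's stories (an assumed tallying convention).
    """
    plus = sum(n for cat, n in counts.items() if cat in INCREASED)
    minus = sum(n for cat, n in counts.items() if cat in DECREASED)
    return plus - minus


# Hypothetical record for one subject, not data from the study.
record = {
    "general_achievement_imagery": 2,
    "stated_need": 1,
    "anticipatory_goal_response": 1,
    "unrelated_imagery": 1,
}
score = n_achievement_score(record)
print(score)
```

Group means of such scores could then be compared across the relaxed, neutral, and failure conditions.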

The validity of these results as true measures 
of n Achievement is discussed particularly as it 
derives from a comparison with similar trends 
obtained with hunger and from a consideration 
of the experimental operations performed on the 
Ss. The data are further interpreted as pointing 
to the dynamics of the test situation as an impor- 
tant determiner of TAT content, as supporting a 
theory of motivation based on anticipatory goal 
responses, and as providing a method for in- 
vestigating such important theoretical constructs 
as “cognitive maps” and “anticipatory goal re- 
sponses” which is more sensitive than that based 
on the usual inferences from performance responses.


THEMATIC APPERCEPTION TEST:
SOME EVIDENCE BEARING ON
THE “HERO ASSUMPTION” *


From its very moment of origin, the Thematic
Apperception Test has been intimately associated
in the minds of most users with the assumption
that there are certain characters in each story
that clearly reflect attributes of the storyteller, 
while other figures are more revealing of the 
storyteller’s perceptions of the individuals who 
populate his personal world. In an empirical 
discipline, however, the length of time that a 
belief has been held entitles it to no special con- 
sideration, and it is therefore quite natural that 
this assumption should be challenged. 

Objections to the hero assumption were par- 
ticularly prone to occur because of the intimate 
relation between an individual's own attributes 
and that which he perceives in the outer world.
Thus, it is commonly assumed, with considerable 
supporting evidence, that an individual who is 
highly aggressive will perceive more hostility in 
the world around him than an individual who is 
less aggressive. This association between internal 
and external tends to disrupt and blur the easy 
distinction implied by the original assumption. 
It should be mentioned that Murray, who initially 
formulated this assumption, was fully aware of 
the complexity of the relation between the individual’s
inner and outer worlds and never intended
that the test interpreter should slavishly
maintain a rigid distinction between “hero” and 
“other.” In fact, the manual originally published 
with the test (12) contains a lengthy discussion
of various complications that under special circumstances
make it necessary to modify or abandon
the assumption of a single hero or identification
figure.

Even accepting the fact that ultimate precision 
will surely depend upon some more complicated 
set of assumptions than those we are here con- 
cerned with, it is still an interesting question 
whether the interpreter of the TAT can go further 
with the assumption of a distinction between hero 
and non-hero than he can with the assumption 
that all figures in the story are equally revealing 
of the “own characteristics” of the storyteller. 
Although endless rational arguments can be in- 
troduced bearing on the choice between these 
assumptions, what we need most is not logical 
inference nor emotional polemic but controlled 
empirical evidence. In this spirit, the present 
paper is intended to outline the results of two 
small investigations designed to provide some 
very tentative findings that bear upon the utility 
of employing the distinction between hero figures 
and non-hero figures in interpreting Thematic 
Apperception Test protocols. 

In the first study, we began with the very 
simple notion that if “heroes” were more indica- 
tive of characteristics of the subject (S) than 
were “other figures,” the S should perceive them 
or react to them differentially. Given this rea- 
soning, we proceeded to administer the TAT to 
a number of Ss, conducted an individual inquiry 
in which we tried to assess the reaction of the 
storyteller to each of the figures in his stories, 
and then looked for differences in reaction to 
those figures independently rated as identification 
figures as opposed to those that were not rated 
as identification figures. 

In the second study, we were able to examine 
certain quantitative shifts that occurred in TAT 
stories following a frustration experience within 
the framework of the hero assumption and within 
the framework of the assumption that all figures 
are equally indicative of characteristics of the 
storyteller. 

It should be clearly understood that in these 
studies we are not asking whether the simple 
assumption of a single hero in each story is the 
most fruitful assumption that can be made. What
we are asking is whether this is a more fruitful 
assumption than the opposite extreme, the as- 
sumption that all figures in the story are equally 
revealing of the storyteller’s characteristics. 
There are many fine gradations between these 
extremes, and a shrewd observer with a facile 
pen can complicate the assumptions endlessly. 
Some of the many alternative assumptions have 
been considered in an earlier paper by one of 
the present authors (6), and his presentation has 
been examined critically by Piotrowski in several 
subsequent papers (13, 14). What is needed 
now, however, is not further complexity or ra- 
tional elaboration but rather a statement of as- 
sumptions with sufficient explicitness so that they 
lead to clear empirical consequences, followed 
by a careful testing of these consequences. 

The findings presented here are, at most, a 
beginning on the road to clarifying the kinds of 
underlying processes that operate in the construc- 
tion of imaginative stories. Their sole virtue lies 
in the fact that they make clear under controlled 
circumstances something about the relative merit 
of two widely divergent and yet defensible assumptions
concerning the process of interpreting
imaginative protocols.


SUBJECT REACTIONS TO HERO AND NON-HERO 
FIGURES 


If the assumption of a hero in each story whose 
attributes are especially revealing of the S’s psy- 
chological makeup is to prove defensible, it 
seemed to us that we should be able to show that 
the S reacted differently to hero figures than he 
did to those figures judged not to be hero figures. 
In particular, we reasoned that if heroes were 
carriers of storyteller attributes, the S should see 
these figures as more similar to himself than the 
non-hero figures, or else he should react with a 
violent denial of any similarity or resemblance 
between himself and the hero figures. This con- 
clusion was derived from the assumption that the 
TAT revealed both conscious attributes and un- 
conscious or unacceptable attributes of the S, 
coupled with the further reasoning that when a 
figure represented a conscious quality of the 
storyteller, he would accept and report the sim- 
ilarity, while under circumstances where there 
was an unacceptable impulse or quality involved, 
he would tend to deny strongly any similarity be- 
tween himself and the figure. 

Consequently, the hypothesis to be tested in 
this study asserts that story-characters independ- 
ently judged to be hero figures are seen by the S 
as similar to himself or as having no self-similar- 
ity whatsoever, while figures independently judged 
not to be hero figures are more often seen by 
the storyteller as resembling others or else as 
representing stereotyped or fictional characters. It 
seems clear that the assumption that all figures 
are equally revealing of the storyteller provides 
no basis for predicting any difference in S reaction
to hero and non-hero figures.


PROCEDURE 


A shortened version of the TAT consisting of 
Cards 2, 5, 7GF, 9GF, 10, and 18GF was ad- 
ministered in a small group setting to 30 Syracuse 
University undergraduate females. Use of group 
administration seemed warranted in view of the 
results of an earlier investigation (7) comparing 
individual and group administration. The Ss had 
volunteered to participate in the study from an 
introductory course in psychology. After com- 
pletion of the group test, individual appointments 
were made for each S within 48 hours from the
time of the original test. During the individual 
interview, each S was asked to tell as much as she 
could concerning the factors that led to her creat- 
ing each of the stories she had constructed. In 
addition, for each character in each of the stories, 
she was required to make a judgment of how sim- 
ilar the character was to herself and how similar 
it was to other people whom she had known. 
The responses of the Ss permitted us to categorize 
each character in terms of similarity to self 
(thought of as self, could be self, some similarity 
to self, denial of similarity to self) and similarity 
to other (thought of as other, similar to other). 
The characters were also classified in a general 
stereotype category when the S reported that the 
figure represented some general fictional character 
or when she reported no resemblance to anyone 
and could not say what had influenced her to 
create this character. 

The stories were independently analyzed in 
order to identify the hero and non-hero figures 
in each story. Two raters with no shared prac- 
tice and with only a few general scoring principles 
were able to reach complete agreement on the 
identity of the hero figure (or the absence of a 
hero) in 90 per cent of the 180 stories rated. 

The analysis of these data posed certain thorny 
problems as the most obvious and pertinent ar- 
rangements of the data led to more observations 
than subjects and thus made customary statistical 
analysis inappropriate. In view of this difficulty, 
we followed the convention of presenting the 
data descriptively in what seems to us the most 
revealing manner and then, in addition, perform- 
ing certain further analyses that permit the ap- 
plication of the usual tests of significance. 

In order to reduce the number of observations 
to the number of Ss, we adopted a procedure for 
each of the three relevant areas of response (sim- 
ilarity to self rather than other, denial of similar- 
ity to self, stereotype) that permitted us to assign 
to each S a score representing the extent to which 
his six stories fitted with or deviated from the 
predicted pattern. Thus, if the first story that 
the individual told found him identifying the 
judged hero of the story as resembling himself, 
while the non-hero figures were judged to be 
similar to others, a score of one would be assigned.
If a non-hero figure was judged to be
similar to the self, while the hero figure was not
judged to be similar to the self, a score of minus
one was assigned. If the hero and non-hero
figures were treated in identical fashion, a score
of zero was entered. The total score for each S
consisted of the sum of the individual scores for 
his six stories and could theoretically vary from 
plus six (confirmation of the hypothesis in every 
story) to minus six (negation of the hypothesis 
in every story). This procedure provided three 
sets of scores representing the extent to which 
the stories and inquiry responses of the individual 
Ss conformed to our predictions in regard to sim- 
ilarity to self versus other, denial of similarity to 
self, and stereotyped response. 
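The per-story scoring rule for the "similarity to self versus other" area can be sketched as follows. This is a minimal illustration, assuming each story has already been reduced to three yes/no judgments; the function names, the boolean encoding, and the sample stories are hypothetical, and patterns the text does not spell out are scored zero here by assumption.

```python
def story_score(hero_like_self, nonhero_like_self, nonhero_like_other):
    """Score one story against the prediction for similarity to self vs. other.

    +1: hero judged like self while non-hero figures are judged like others
    -1: a non-hero figure judged like self while the hero is not
     0: any other pattern (treated as 'no difference', an assumption)
    """
    if hero_like_self and not nonhero_like_self and nonhero_like_other:
        return 1
    if nonhero_like_self and not hero_like_self:
        return -1
    return 0


def subject_score(stories):
    """Sum the per-story scores; with six stories the range is -6 to +6."""
    return sum(story_score(*s) for s in stories)


# Hypothetical inquiry results for one subject's six stories.
stories = [
    (True, False, True),    # confirms the prediction
    (True, False, True),    # confirms
    (False, True, False),   # reverses the prediction
    (True, True, False),    # hero and non-hero treated alike
    (False, False, False),  # no relevant judgments
    (True, False, True),    # confirms
]
total = subject_score(stories)
print(total)
```

The same pattern, with a different per-story rule, would yield the scores for denial of similarity and for stereotyped response.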


RESULTS


The general results of the study are summarized
in Table 1, where we find the distribution of hero
and non-hero figures in each of the categories
having to do with similarity to self and to other
as well as the stereotype category.


TABLE 1


SUBJECT REACTIONS TO HERO
AND NON-HERO FIGURES


                        Frequencies
Reactions           Hero (observed)    Total

Self                       3              6
Could be self             14             18
Similar to self           21             27
Denial of self            29             59
Other                      9             27
Could be other            33             88
Stereotype                18            236

Total                    187            461


A comparison
of the frequencies actually obtained with those 
that would be expected by chance alone makes 
it evident that the trend of these data is in sup- 
port of the predictions made in advance of the 
study. Figures independently rated as heroes 
tend to be perceived as more similar to the story- 
teller than non-hero figures, or else are denied 
any similarity to the storyteller. On the other 
hand, non-hero figures, when compared to hero 
figures, tend more often to be identified as similar 
to some person other than the storyteller or else
are classified as stereotyped. If the categories
implying various degrees of similarity to self are
combined into a single category and the same
operation is performed for categories representing
similarity to others, it is then possible to examine
the association between the hero and non-hero
distinction and similarity to self and other in a
single 2 × 2 table. In Table 2, the outcome of
such a procedure is represented, and there is
clear evidence for association between the hero
designation and perceived similarity to the self.


TABLE 2


SIMILARITY TO SELF AND OTHER OF HERO
AND NON-HERO FIGURES


                     Hero    Non-Hero
Similar to self       38        13
Similar to other      42        73


It is, of course, not legitimate to perform the
usual statistical analyses because of the lack of 
independence of observations. However, any 
such test applied to this array of findings would 
indicate the predicted association at a highly 
significant level. In general, these findings hold 
true not only for the over-all distribution reported 
in Table 1, but they are also sustained when the 
same distribution for each of the six cards is 
examined separately. 
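The remark that any such test would show the predicted association can be illustrated with the frequencies of Table 2. The sketch below computes a Pearson chi-square without continuity correction; that choice is an assumption made here for illustration, since the text does not name a particular test, and the function name is hypothetical. As the authors point out, the observations are not independent, so the value is descriptive only.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Frequencies from Table 2: rows = similar to self / similar to other,
# columns = hero / non-hero.
chi2 = chi_square_2x2(38, 13, 42, 73)
print(round(chi2, 2))
```

With these frequencies the statistic comes to roughly 20.4, well above 10.83, the .001 critical value for one degree of freedom.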

The results of our statistical comparison of
hero and non-hero figures are presented in
Table 3, where it is again made clear that the
data tend to support our hypothesis.


TABLE 3


CONFIRMATION OF PREDICTIONS CONCERNING
DIFFERENCES IN REACTION TO HERO AND
NON-HERO FIGURES


Reaction                                  X̄ (a)

Hero similarity to self and
  non-hero similarity to other            .966      .343
Denial of similarity to self              .343      .186
Stereotype                               1.033      .261

(a) A positive deviation from zero indicates confirmation
of the predicted relation.


When heroes and non-heroes are compared in regard
to perceived similarity to self and others, the
scores present a significant deviation from chance 
in favor of the predicted greater similarity of self 
to hero figures and of others to figures judged 
to be non-heroes. The stories of 18 of our Ss 
revealed the predicted association between self- 
similarity and hero figures and other-similarity 
and non-hero figures; for four of the Ss, the hero 
and non-hero figures were treated in similar 
fashion, while the remaining eight Ss reversed 
the prediction. Examination of the tendency to 
deny similarity to self reveals confirmation of our 
prediction that this would occur more frequently 
with hero than with non-hero figures. A number 
of Ss provided no evidence of denial in any of 
their responses, so there was no possibility of 
differential perception of the hero and non-hero 
figures; but in those cases where there was such 
evidence, 12 of 17 Ss showed a tendency to react 
with denial to hero figures more often than to 
non-hero figures. We also found evidence that 
non-hero figures were more often categorized as 
stereotyped or unrelated to either self or to other 
persons than were hero figures. There were nine 
Ss who revealed no difference in the incidence of 
hero and non-hero figures who were perceived 
as stereotyped; but of the remainder, 17 reported 
the non-hero figures as more stereotyped, and only 
four saw the hero figures as more stereotyped.

The above findings provide clear confirmation
of our predictions and thus support the value of 
the hero assumption. These same results, how- 
ever, dramatically underline the shortcomings of 
this assumption in the face of certain TAT stories. 
Although the general trend of the data fits with 
the derivation from the hero assumption, there 
are individual Ss who consistently reverse the 
prediction; e.g., there are some Ss who character- 
istically view non-hero figures as like themselves 
and hero figures as like others. Thus, it seems 
clear that the actual situation is more complex 
than a literal application of the hero assumption 
would imply. Consequently, it becomes an im- 
portant investigative task for the future to dis- 
cover something about the types of Ss or stories 
or both where this assumption may be applied 
fruitfully, as well as those where some other as- 
sumption should be utilized. This same conclu- 
sion is supported by the general findings con- 
tained in Table 1. While these results support 
our prediction, it is a group trend that we observe 
with many individual exceptions. 

A very brief summary of our Ss’ reports concerning
what had provided them with the idea
for their stories is contained in Table 4.


TABLE 4

REPORTED SOURCE OF TAT STORIES


                            Frequency of Occurrence
Source                          n          %

Imagination                     87         48
Properties of card              66         37
Autobiographical event          37         21
Experience of others            30         17
Reading: fiction                28         16
Movies, TV                      24         13
Reading: non-fiction             8          4

* Total number of stories: 180.


The results summarized here make clear that in a
large number of cases (48 per cent) the Ss indi- 
cate only that the story came from their imagina- 
tion. Next most frequently mentioned as a de- 
terminant is the picture or some specific element 
within it (37 per cent). In those cases where 
the Ss are able to identify a specific experience as 
having suggested the story, this was less likely to 
have been a fictional encounter (movies 13 per 
cent, novels 16 per cent) than an autobiographical 
event (21 per cent) or general experience and 
observation (17 per cent). Only a very small 
number (4 per cent) of the stories were reported 
to have stemmed from nonfiction reading.
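The percentages in Table 4 follow directly from the counts and the stated total of 180 stories, as the short check below shows. It is a sketch only: the variable names are hypothetical, and the count for "Imagination" is read here as 87, the value consistent with the reported 48 per cent.

```python
TOTAL_STORIES = 180  # stated in the footnote to Table 4

# Number of stories for which each source was reported (Table 4).
sources = {
    "Imagination": 87,  # read as 87; consistent with the reported 48 per cent
    "Properties of card": 66,
    "Autobiographical event": 37,
    "Experience of others": 30,
    "Reading: fiction": 28,
    "Movies, TV": 24,
    "Reading: non-fiction": 8,
}

# Percentage of all stories, rounded to the nearest whole number.
percentages = {name: round(100 * n / TOTAL_STORIES) for name, n in sources.items()}
for name, pct in percentages.items():
    print(f"{name}: {pct} per cent")
```

The counts sum to more than 180, which suggests that more than one source could be reported for a single story.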

In summary, the general findings of this study 
provide modest support for the hero assumption, 
although they also suggest that in individual cases 
the data do not mesh smoothly with this assumption.
Let us turn now to the second study.


CHANGES FOLLOWING FRUSTRATION VIEWED FROM 
THE VANTAGE OF THE HERO ASSUMPTION 


As we have already agreed, one of the major 
difficulties in evaluating the general utility of the 
hero assumption is posed by the close relation 
between internal states of the S and his perception 
of external reality. It is not easy to conceive of
circumstances that on a priori grounds seem likely 
to produce changes in the motivational state of 
the S which will not be mirrored in changes in 
his perception of external reality. What we need 
is to construct a set of conditions under which 
the hero assumption predicts differential changes 
in hero and non-hero characteristics while the 
other assumption, of course, predicts consistent 
changes for both types of figures. 

Such a combination of conditions seemed to 
exist in connection with a set of data that was 
part of an earlier study (5). These data con- 
sisted of a set of TAT protocols collected before 
and after a frustration experience together with 
an appropriate set of control protocols. Most 
important, a great deal was known about the 
details of the frustration experience and the Ss’ 
reactions to this situation. This information per- 
mitted us to make specific predictions concerning 
the changes that could be expected if one assumed 
all figures to be equally representative of the 
storyteller or if one assumed that there was only 
one figure, the hero, that reflected personal char- 
acteristics of the storyteller. 

In brief, the frustration situation was of such 
a nature that it seemed plausible to expect that 
the S’s hostility toward others would increase, 
that he would also perceive others as directing 
hostility or aggression toward him, and finally 
that he would feel guilty or direct aggression 
toward himself. All acts of aggression within 
the TAT stories were analyzed in terms of whether 
the aggression was directed from (a) hero against 
other, (b) hero against self, (c) other against 
hero, or (d) other against other. 
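The four-way coding of aggressive acts can be sketched as a small classifier. This is an illustration under assumptions: it supposes that each act has already been reduced to two judgments (whether the aggressor is the hero and whether the target is the hero), and the function name and sample acts are hypothetical.

```python
from collections import Counter


def aggression_category(actor_is_hero, target_is_hero):
    """Map one aggressive act to category (a)-(d) as listed in the text."""
    if actor_is_hero and not target_is_hero:
        return "hero against other"
    if actor_is_hero and target_is_hero:
        return "hero against self"
    if not actor_is_hero and target_is_hero:
        return "other against hero"
    return "other against other"


# Hypothetical acts from one protocol: (actor_is_hero, target_is_hero).
acts = [(True, False), (False, True), (False, True), (True, True), (False, False)]
tally = Counter(aggression_category(a, t) for a, t in acts)
print(tally)
```

A tally of this kind, made once before and once after frustration, gives the raw counts on which the two assumptions make their differing predictions.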

The assumption that all figures were equally 
characteristic of the storyteller implied that there 
should be a significant increase in aggressive acts 
of all four types in view of our prior information 
concerning the increased aggressive tendencies on 
the part of the storyteller. This assumption 
clearly suggests that if the S was more aggres- 
sive, this tendency should be revealed evenly or 
equally in all figures within the story. On the 
other hand, the hero assumption implied that 
there should be an increase in only three of the 
four categories. Aggressive acts carried out by 
the hero against others should increase as a result 
of the storyteller’s increased extrapunitive tend- 
encies. There should also be an increase of ag- 
gressive acts carried out by the hero against him- 
self in view of the increased guilt or intrapunitive- 
ness of the storyteller. Finally, there should be 
an increase in aggressive acts carried out by others 
against the hero as a result of his perception of 
the other members of the group as hostile toward 
him. There was nothing in our analysis of the 
frustration situation to suggest that the story- 
teller saw the persons around him as being hos- 
tile or aggressive toward each other so we would 
not predict any change in hostility between non- 
hero figures. Consequently, we find that the two 
assumptions agree in predicting significant changes 
in three of the four categories but are differen- 
tiated in their predictions concerning the fourth 
category. 

The reasoning above is readily defensible on 
rational grounds. The important feature of this
derivation, however, is not its invulnerability to 
logical assault but rather the fact that it was 
executed prior to analysis of any data except that 
having to do with a category where no difference 
was predicted (aggressive acts carried out by
heroes against others). In other words, before 
the fact, the two assumptions, coupled with our 
detailed knowledge of the frustration situation, 
seemed to lead to a differentiated prediction which 
we set out to test. After the fact, there is little 
doubt that with reasonable motivation and in- 
genuity either assumption could be rationalized 
by a sophisticated observer with these or almost 
any other set of empirical findings. 


PROCEDURE 


The Ss in this investigation were 40 male under- 
graduate students of Harvard University who had 
been selected so that on the Allport-Kramer 
Prejudice Scale (7) they fell at the extremes of 
a group of 575 students enrolled in an under- 
graduate psychology course. This division of the 
Ss into high and low prejudice groups is of no 
interest in the present inquiry, and our current 
analysis overlooks the dimension of prejudice 
except for the fact that experimental and control 
Ss were individually matched in terms of preju- 
dice score. The 20 control Ss were also matched 
with the experimental Ss in age. The Ss were 
told merely that they were participating in a study 
of personality structure and development. At 
the very outset of the study, each S was admin- 
istered individually a shortened version of the 
Thematic Apperception Test consisting of Cards
3BM, 8BM, 16, and 20. At the end of approxi- 
mately two months, the 20 experimental Ss were 
exposed to an experimentally contrived frustra- 
tion situation, and immediately following this, 
they were again given the TAT. The instructions 
were to make no effort to recall their original 
story, but if they thought of it first, to put it 
aside and tell the next story that came to mind. 
The control Ss were given the TAT under the 
same conditions except that there was no inter- 
vening frustration experience. 

The frustration experience has been fully de- 
scribed elsewhere (9), and it is necessary here 
only to point out that the experience was care- 
fully divorced from the administration of the TAT 
and that the S was exposed to multiple frustration 
involving both psychological and complex social 
motives. The frustration of the latter motives 
was effected in a group experiment conducted 
with four confederates, two male and two female, 
who were ostensibly fellow participants in the 
study. By disguised manipulation, the experi- 
mental S was made to fail repeatedly on the group 
task, which was presumably related to intelligence; 
he thereby failed to achieve a sizeable financial 
reward offered for successful performance, and 
he also kept the other members of the group from 
winning financial rewards. At various times be- 
ginning immediately after the frustration situa- 
tion, detailed subjective reports were collected, 
and these, in addition to objective observations 
made during the conduct of the experiment, en- 
abled us to describe quite accurately just how the 
Ss experienced or reacted to this situation. 

The TAT protocols were scored simply by 
counting aggressive acts and then coding them 
in terms of whether or not they were carried out 
by the hero of the story and whether or not they 
were directed toward the hero. The effects of 
the frustration situation were measured by means 
of subtracting for each experimental S the num- 
ber of aggressive acts within each category before 
frustration from the number of such acts after 
frustration. From this difference score, we then 
subtracted the difference between the first and 
second administrations of the test for the 
matched control S. In other words, the scores 
that we are concerned with represent the differ-
ence between the first and second test adminis-
trations for the experimental Ss, corrected by the
equivalent shifts shown by the control Ss.

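The scoring rule just described can be sketched in a few lines. This is a minimal illustration only: the function name and the counts below are hypothetical and are not data from the study.

```python
# Corrected change score for one category of aggressive act
# (for example, hero against other). All counts are hypothetical.

def corrected_shift(exp_before, exp_after, ctl_first, ctl_second):
    """Experimental S's change minus the matched control S's change."""
    experimental_change = exp_after - exp_before
    control_change = ctl_second - ctl_first
    return experimental_change - control_change

# One matched pair: the experimental S rises from 2 to 5 acts,
# while the matched control S rises from 3 to 4 without frustration.
score = corrected_shift(exp_before=2, exp_after=5, ctl_first=3, ctl_second=4)
print(score)  # 2
```

A positive score means the experimental S's increase exceeded the shift shown by the matched control on simple retesting.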
RESULTS 


The results of our analysis are summarized in
Table 5. Utilizing either assumption, we pre-
dicted that aggressive acts by the hero against
others, by hero against self, and by others against
hero, would increase following frustration. There
is confirmatory evidence for all of these predic-
tions. The shift is in the predicted direction in
all three cases and is statistically significant at the
conventional .05 level for acts involving the hero
against others, and others against hero. It is just
short of this significance level for acts involving
the hero against self.


TABLE 5


CHANGES FOLLOWING FRUSTRATION IN
INCIDENCE OF VARIOUS TYPES OF
AGGRESSIVE ACTS


Shift in      Hero        Hero        Other       Other
Score         Against     Against     Against     Against
              Other       Self        Hero        Other

Increase
   4                                  xxx
   3          x           x           xxx         x
   2          xxxxxx      xxx         xxxxx
   1          xxxxxx      xxxx        xxxxx       xxxxxx
   0          xxx         xxxxxxxx    xxx         xxxxxxx
Decrease
  -1          xxx         xxx         x           xxx
  -2                      x                       xxx
  -3          x

Mean          .75         .40         1.75        0
σ of mean     .33         .28         .29         --
t             2.27        1.43        6.89        --
p             <.02        <.10        <.005       --

When we turn to the fourth column of Table 5, 
we find the data bearing upon the differential 
predictions derived from the two assumptions. 
The hero assumption predicted no change, where- 
as the other assumption predicted a shift similar 
to that observed for the categories just discussed. 
The results are surprisingly definitive as the dis-
tribution of change scores has a mean of exactly
zero; there is no evidence whatsoever of any 
shift in this category. In other words, the data 
we have reported provide strong evidence for 
the superior predictive efficiency of the hero as- 
sumption under the single circumstance where the 
two assumptions differ in their consequences. 


DISCUSSION 


It is clear that the results of these two studies 
provide some warrant for the continued use of 
the hero assumption. Our findings suggest that 
under two circumstances the derivations from 
this crude assumption fit the observed data better 
than the derivations that can be made from the 
easy assumption that all figures in the story are 
equally revealing of storyteller characteristics.
Having agreed to this, however, we must hasten 
to emphasize the importance of research and 
formulation that will lead us to a more elaborate 
statement of the hero assumption so that under 
known conditions we can apply the kind of com- 
plexity in analysis that our findings, as well as 
the convictions of most clinicians, imply is neces- 
sary in order to derive consistently sensitive and 
accurate inferences from the instrument. 

There is no discussion of the alternatives to the 
hero assumption nearly so illuminating as Mur- 
ray’s original analysis of the hero distinction 
where we find the following special cases pro- 
posed: 

(1) The identification of subject with char- 
acter sometimes shifts during the course of the 
story; there is a sequence of heroes (first, second, 
third, etc.). (2) Two forces of the subject’s per- 
sonality may be represented by two different 
characters, for example, an antisocial drive by a 
criminal and conscience by a law-enforcing agent. 
Here we would speak of an endopsychic thema 
(internal dramatic situation) with two component 
heroes. (3) The subject may tell a story that 
contains a story, such as one in which the hero 
observes or hears about events in which another 
character (for whom he feels some sympathy) is 
leadingly involved. Here we would speak of a 
primary and a secondary hero. Then (4), the 
subject may identify with a character of the op- 
posite sex and express a part of his personality 
just as well in this fashion. (In a man this is 
commonly a sign of a high feminine component 
and in a woman of a high masculine component.)
Finally, there may be no discernible single hero;
either (5) heroship is divided among a number 
of equally significant, equally differentiated par- 
tial heroes (e.g., a group of people); or (6) the 
chief character (hero in the literary sense) ob- 
viously belongs to the subject-object situation; he 
is not a component of the storyteller’s personality 
but an element of his environment. The subject, 
in other words, has not identified with the prin- 
cipal character to the slightest extent but has 
observed him as he would a stranger or disliked 
person with whom he had to deal. The subject 
himself is not represented, or is represented by a 
minor character (hero in our sense) (12, p. 7).

Having tentatively identified such special cases 
is an important contribution, but it is equally 
essential to provide a careful specification of how 
one goes about identifying actual stories and 
figures that should be interpreted in the light of 
each of these cases. This, it seems to us, is a 
promising and largely untouched area of re- 
search. 

There are several tacks which such investiga- 
tion might follow. First, it is possible that one 
might be able to discover within-story cues that 
would be helpful in deciding whether to apply 
the simple hero assumption or some more com- 
plex version. Second, there is the ever-present 
likelihood that knowledge of which assumption 
would be most fruitful will depend upon further 
information concerning the S himself. Thus, the 
appropriate assumption might vary with the 
cognitive style, character type, cultural back-
ground, or intellectual level of the S. Third, de-
pending upon the situational context in which
the test is administered, the process of story 
creation may vary so that different assumptions 
are warranted. When the test is given in a 
threatening assessment situation, we might find a 
different interpretive assumption warranted than 
when the test is given in a permissive clinical 
setting, where the S is voluntarily seeking assist- 
ance. There are, of course, many other types of 
questions that might be asked concerning this 
aspect of the interpretive process. However, if 
we knew something about the relation between 
the variants of the hero assumption and variation 
in the nature of the story, in the characteristics 
of the S, and in the situational context, we would 
be tremendously advanced over our present posi- 
tion. 

Granted that further research is indicated, do
our findings in their present state provide us with 
useful information? They do! First, as we have 
indicated, they give modest support for those 
clinicians and investigators who have habitually 
employed the hero assumption in their use of the 
TAT. Second, they provide negative evidence 
for the various persons working with the instru- 
ment who have attempted to eliminate completely 
the hero assumption in favor of the other alterna- 
tive we have considered. One may object to these 
inferences on the grounds that our findings are 
by no means definitive. Indeed they are not! 
But in the absence of definitive findings, one uses 
the best evidence one can find, and it seems to us 
that the present studies fit this specification even 
if only by default. 

There remains the interesting question whether 
our findings have any implications for the re- 
search carried out by the many investigators in- 
terested in the quantitative study of the TAT who 
have not employed the hero assumption. What 
of investigations such as those by Eron (2), 
Hartman (3), Henry (4), and McClelland et al. 
(11), where there is typically no distinction be- 
tween hero and other? In spite of the highly 
tentative nature of our findings, they do seem to 
imply that these investigators might have demon- 
strated somewhat greater test sensitivity if they 
had recognized the difference between character- 
istics displayed by hero and by other figures in 
their efforts to relate test attributes to independent 
measures. This same possibility exists in connec-
tion with some of our own research (8, 10) and
poses an intriguing problem for further investiga-
tion.


SUMMARY 


We have conducted two studies designed to test 
the comparative effectiveness of the assumption 
that there is customarily a single figure in TAT 
stories which is particularly revealing of the 
storyteller’s own attributes as opposed to the as- 
sumption that all figures in the stories are equally 
revealing of the subject's characteristics. In the 
first study, 30 female undergraduate subjects 
were asked to judge the similarity between them- 
selves and the figures in TAT stories they had 
created. They also reported on the similarity be- 
tween these figures and other persons they had 
known, and described what had led them to tell
each story. The results of the study confirmed
our prediction derived from the hero assumption 
that hero figures would more often be identified 
as similar to self or else denied any similarity to 
self, while non-hero figures would more often be 
identified as similar to other persons or else would 
represent general stereotypes. 

The second study examined changes in TAT 
protocols following a frustration experience. Ag- 
gressive acts carried out by heroes against others 
and against the self, and also aggressive acts car- 
ried out by others against the hero, all increased 
following frustration. There was no change in 
the incidence of aggressive acts carried out by
others against others. The hero assumption had 
predicted just this pattern of results, while the 
alternative assumption incorrectly predicted con- 
sistent increases in all four types of aggressive 
acts. 

Thus, the results of both studies provide evi- 
dence supporting the utility of the conventional 
assumption of a hero in each TAT story. The 
findings also suggest, however, that under certain 
conditions a more complex set of assumptions 
may be desirable or necessary. 


REFERENCES 


1. Allport, G. W., and Kramer, B. M. Some 
roots of prejudice. J. Psychol., 1946, 22, 9- 
39. 

2. Eron, L. D. A normative study of the The- 
matic Apperception Test. Psychol. Monogr., 
1950, 64, No. 9 (Whole No. 315). 

3. Hartman, A. A. An experimental examina- 
tion of the Thematic Apperception Technique 
in clinical diagnosis. Psychol. Monogr., 
1949, 63, No. 8 (Whole No. 303). 

4. Henry, W. E. The Thematic Apperception 
Technique in the study of culture-personality 
relations. Genet. Psychol. Monogr., 1947, 
35, 3-135. 

5. Lindzey, G. An experimental examination
of the scapegoat theory of prejudice. J. ab- 
norm. soc. Psychol., 1950, 45, 296-309. 

6. Lindzey, G. Thematic Apperception Test: 
Interpretive assumptions and related empirical 
evidence. Psychol. Bull., 1952, 49, 1-25. 

7. Lindzey, G., and Heinemann, Shirley H.
Thematic Apperception Test: Individual and
group administration. J. Pers., 1955, 24, 34-
39.

8. Lindzey, G., and Newburg, A. S. Thematic
Apperception Test: A tentative appraisal of
some “signs” of anxiety. J. consult. Psychol.,
1954, 18, 389-395.

9. Lindzey, G., and Riecken, H. W. Inducing
frustration in adult subjects. J. consult. Psy- 
chol., 1951, 15, 18-23. 

10. Lindzey, G., and Tejessy, Charlotte. The- 
matic Apperception Test: Indices of aggres- 
sion in relation to measures of overt and 
covert behavior. Amer. J. Orthopsychiat., 
1956, 26, 567-576. 

11. McClelland, D., Atkinson, J. W., Clark, R. A., 
and Lowell, E. L. The achievement motive. 
New York: Appleton-Century-Crofts, 1953. 

12. Murray, H. A. Thematic Apperception Test 
Manual. Cambridge: Harvard Univer. Press, 
1943. 

13. Piotrowski, Z. A. The Thematic Appercep- 
tion Test of a schizophrenic interpreted ac- 
cording to new rules. Psychoanal. Rev., 
1952, 39, 230-251. 

14. Piotrowski, Z. A. TAT newsletter. J. proj.
Tech., 1952, 16, 512-514.


SYSTEMATIC CHANGES IN 
WORD ASSOCIATION NORMS: 
1910-1952 * 


JAMES J. JENKINS AND 
WALLACE A. RUSSELL + 


A norm may be defined as a standard and a 
standard, as everyone knows, is an unchanging 
rule to which appeal may be made for precise 
measurement. Perhaps it is this verbal legerde-
main which leads psychologists to overlook the
fact that norms which are based on consensus 
or popularity of responses are especially sus- 
ceptible to change and require frequent checking 
in our rapidly changing society. 

Word association norms, an example par excel- 
lence of the consensus norms, have been com- 
monly assumed to be stable and highly consistent 
from one time to another. Indeed, investigators 
rarely raise the issue of the possibility of norma- 
tive changes but concentrate their attention on 
differences between groups already known to be 
disparate. Thus, in addition to the very numer- 
ous attempts to compare and contrast clinical 
groups, many studies have been directed toward 
comparisons of responses of normal individuals 
known to differ in some respect. Esper (5), for 
example, compared the most frequent responses 
of American and German students to a standard 
list of stimulus words and Rosenzweig (16) con-
trasted the responses of American and French
students; children’s responses have been con-
trasted with adults’ responses (15, 24); the re-
sponses of men have been contrasted with those
of women (21, 8); and the responses of one pre-
professional group have been compared with those
of another (7). 

While these studies have been both interesting 
and informative, it is illustrative of psychologists’ 
blindness to the normative problems resulting 
from social change, that little attention has been 
given to detailed comparisons of even roughly 
matched groups tested in different years widely 
separated in time. As far as we have been able 
to discover only three investigations have men- 
tioned the question. 

O'Connor (14) in the course of an extensive
item analysis of the Kent-Rosanoff test data he
had collected on a sample of “male factory work-
ers” briefly compared the popular responses to
those of the “mixed adult sample” of Kent and
Rosanoff (11). However, he did not pursue the
matter further than remarking: “The common 
response to a stimulus word is, therefore, a sci- 
entific reality in the sense that it is rediscoverable 
by different workers in a new laboratory after a 
lapse of 15 years,” and “the identity of a common 
response is more reproducible than its frequency 
of appearance” (14, pp. 200-201). Recently, 
Tresselt and Leeds (22) compared word asso-
ciation responses collected in 1952 on a small
sample, heterogeneous with respect to residence
and occupation, with “the original norm on the
Kent-Rosanoff established in 1927” (actually the 
1910 norms). They found a marked increase 
in primary responses which were responses of 
opposites (Dark-Light, Black-White) and coexist- 
ing pairs (King-Queen, Salt-Pepper). Finally, 
Dörken (3) studied 10 high frequency stimulus-
response pairs from the Kent-Rosanoff list in
various normative collections and discovered that 
the frequency of these responses seemed to be 
increasing steadily since 1910. 

Apparently, though these writers suggest that 
changes are underway, a detailed study of changes 
in word association responses over time remained 
to be done. 

A few years ago the writers fortuitously dis-
covered that the 25-year-old word association
norms for students at the University of Minne- 
sota did not adequately represent the associative 
habits of the present student population. To 
facilitate the derivation of functional relation- 
ships in another problem (9) new norms were 
gathered for the Kent-Rosanoff Word Associa- 
tion Test (K-R) for students in introductory psy- 
chology at this university. As these norms were 
studied and compared with the earlier Minnesota 
norms collected by Schellenberg (18), it became 
obvious that the changes were not minor, trivial 
adjustments of a few response frequencies (as we 
had first suspected) but represented a fairly major 
and systematic change in the norms. Further 
work suggested that the change was not a pe- 
culiarity of sampling or technique but a phe- 
nomenon of such scope that a comparison with 
more diverse populations was indicated. Accord-
ingly, other major American normative collec-
tions were studied.

This paper presents first the comparison be- 
tween the early and later Minnesota norms and 
then goes on to report the major findings of the 
comparison of these two sets of norms with the 
other large-scale normative collections we have 
been able to discover. The purpose of the study 
is to attempt to make clear both the magnitude 
and the nature of the changes revealed by the 
comparisons of word association norms and to 
argue that these changes reveal a systematic and 
important trend in responding to free association 
tests, which deserves further careful considera-
tion by students of verbal behavior, language and
culture. 


CHARACTERISTICS OF THE FIVE NORMS 


We shall set down at once what is known con- 
cerning date of collection, characteristics of sam- 
ples, and conditions of testing for each of the 
norms. All five studies employed the same stim- 
ulus list and presumably used the same order of 
presentation. 

Norm 1910: Norms collected by Kent and 
Rosanoff (11); published in 1910; collection dates
not reported. 

Testing: Oral form; individual test. 

Sample: 1,000 “mixed adults.” 

Among these subjects were persons of both 
sexes and of ages ranging from eight years to 
over eighty years, persons following different 
occupations, possessing various degrees of mental 
capacity and education, and living in widely sep- 
arated localities. Many were from Ireland and 
some of these had but recently arrived in this 
country; others were from different parts of Eu- 
rope, but all were able to speak English with at 
least fair fluency. Over two hundred of the sub- 
jects, including a few university professors and 
other highly practiced observers, were profes- 
sional men and women or college students. About 
five hundred were employed in one or another 
of the New York State hospitals for the insane, 
either as nurses and attendants or as workers at 
various trades, the majority of these were per- 
sons of common school education, but the group 
includes also, on the one hand a considerable 
number of high school graduates; and on the 
other hand a few laborers who were almost or 
wholly illiterate. Nearly one hundred and fifty 
of the subjects were boys and girls of high school 
age, pupils of the Ethical Culture School, New 
York City. The remaining subjects form a mis- 
cellaneous group, consisting largely of clerks and 
farmers (pp. 38-39). 

Norm 1925: Norms collected by O’Connor
(13, 14); published in 1928; collected in 1925.

Testing: Presumably oral form; individual test; 
no details are given. 

Sample: “1,000 adult male factory workers”; 
no further description of sample. 

Norm 1927: Norms collected by Schellenberg
(18); published in 1930; collected in fall, 1927.

Testing: Group test, printed form, written an-
swers; instructions suggest time pressure. Other
stimulus words were employed following the K-R 
list. 

Sample: 925 students entering the University 
of Minnesota. Seventy-nine per cent were fresh- 
men; 21% advanced students. Fifty-seven per 
cent men; 43% women. 

Norm 1933: Norms collected by Keene (10); 
published in 1951; collected in academic year 
1933-34. 

Testing: Oral form; individual test. Other 
stimulus words were employed following the K-R 
list. 

Sample: Five hundred Stanford students, 276 
men and 224 women... The group included 
individuals from all four undergraduate levels 
and also included some graduate students. They 
were students from the various psychology classes 
and their friends (p. 43). 

Norm 1952: Norms collected by Russell and 
Jenkins (17); published in 1954; collected in aca- 
demic year 1952-53. 

Testing: Group test, printed form, written an- 
swers. Instructions suggest a time pressure. 
(Same instruction with regard to time as Norm 
1927.) 

Sample: 1,008 students in introductory psychol-
ogy at the University of Minnesota. Largely
sophomore population. About 60% male, 40% 
female. 


COMPARISON OF THE 1927 AND 1952 
MINNESOTA NORMS 


Since the comparison of old and new Minne-
sota norms stimulated the remainder of the study
and since this is the most readily justified as to
comparability of subjects (Ss) and method, the
Minnesota comparison is given first and in detail.
The 1927 and 1952 norms were compared with
respect to the following characteristics: (a) fre-
quency of primary, secondary and tertiary re-
sponses, (b) number of changes in primary, sec-
ondary, and tertiary responses, (c) relation of the
strength of the primary frequencies to the same
stimulus words in the two norm sets, (d) the response
rank-frequency distribution, and (e) the characteris-
tics of the changed response words.

TABLE 1


AVERAGE FREQUENCY OF THE THREE MOST
POPULAR RESPONSES ON THE 1927 AND 1952
MINNESOTA NORMS


                    Frequency                Percentage
Response      1927         1952
Rank          (N = 925)    (N = 1008)      1927      1952

1st           267.5        377.5           28.9      37.5
2nd           111.9        137.1           12.1      13.6
3rd            72.8         81.2            7.9       8.1
Total         452.2        595.8           48.9      59.1


Frequency of Primary, Secondary, and Tertiary
Responses.—Table 1 gives the average frequency
of the primary, secondary, and tertiary responses
in each of the norms as well as the total frequency
of the three ranks combined. It is clear that a
marked increase in the frequency of the most 
popular response has occurred. The primary re- 
sponse has on the average increased in strength 
by almost 30% of its earlier value; while the most 
popular response was given by 289 students per 
thousand in 1927, in 1952 it was given by 375 
per thousand. This increase is also manifest at 
the second ranking response, but the difference 
has almost disappeared by the third response. 
Overall, the first three responses in 1952 ac- 
counted for the responses of 591 students per 
thousand, while in 1927 they accounted for only 
489 responses per thousand. The change toward 
more popular responses has obviously been very 
pronounced. 
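The per-thousand figures and the "almost 30%" relative gain quoted above follow directly from Table 1. The short check below is a sketch for the reader's convenience; the function and variable names are illustrative and are not part of the original analysis.

```python
# Average frequencies of the three most popular responses, taken from Table 1.
N_1927, N_1952 = 925, 1008
freq_1927 = {"1st": 267.5, "2nd": 111.9, "3rd": 72.8}
freq_1952 = {"1st": 377.5, "2nd": 137.1, "3rd": 81.2}

def per_thousand(freq, n):
    """Express an average frequency as responses per 1,000 subjects."""
    return 1000 * freq / n

def relative_gain(rank):
    """Percentage increase of the 1952 rate over the 1927 rate for one rank."""
    old = per_thousand(freq_1927[rank], N_1927)
    new = per_thousand(freq_1952[rank], N_1952)
    return 100 * (new - old) / old

for rank in freq_1927:
    old = per_thousand(freq_1927[rank], N_1927)
    new = per_thousand(freq_1952[rank], N_1952)
    print(f"{rank}: {old:.0f} -> {new:.0f} per thousand "
          f"({relative_gain(rank):+.1f}% relative change)")
```

Expressing both years per thousand removes the difference in sample size (925 versus 1,008) before the two sets of norms are compared.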

Changes in Popular Response Words.—The 
two sets of norms were compared as to the iden- 
tity of particular response words. Words rank- 
ing first, second, and third in response to a given 
stimulus in 1927 were compared to the words 
ranking first, second, and third in 1952. The 
results are shown in Table 2. 

This table shows, as might have been expected, 
that the words which were primary responses in 
1927 tend to be the primary responses in 1952; 
71 of the 100 primary responses were identical. 
Fourteen of the 1927 primaries slipped to the sec- 
ond rank in 1952, eight to the third rank, and 
seven to ranks below that. The 1927 secondary 
responses were somewhat unstable; 17 moved up 
to become 1952 primaries, 14 fell to become
tertiaries, and 33 fell to lower ranks; only 36 re-
mained the same. The 1927 tertiaries have an
even greater dispersion, as the table shows. Over-
all, 91 new words entered the top three ranks,
and, of course, 91 older words moved out. It is
clear that most of this turnover took place at the
third rank, with the ranks becoming increasingly
more stable as the strength of response increased.


TABLE 2


COMPARISON OF THE RESPONSE WORDS
OCCURRING IN THE FIRST THREE RANKS IN
THE 1927 AND 1952 MINNESOTA NORMS


                          1927 Norms
1952
Norms       Rank 1    Rank 2    Rank 3    Lower    Total

Rank 1        71        17         5         7      100
Rank 2        14        36        23        27      100
Rank 3         8        14        21        57      100
Lower          7        33        51                 91

Total        100       100       100        91
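The turnover figures cited in this paragraph can be recomputed from the cell counts of Table 2. The following is a minimal sketch; the dictionary layout and variable names are assumptions made for illustration.

```python
# Rows: rank of a word in 1952. Columns: rank of the same word in 1927,
# in the order 1, 2, 3, lower. Counts are those reported in Table 2.
ranks = ["1", "2", "3", "lower"]
table = {
    "1":     [71, 17, 5, 7],
    "2":     [14, 36, 23, 27],
    "3":     [8, 14, 21, 57],
    "lower": [7, 33, 51, None],  # words below rank 3 in both years are not tabulated
}

# Words that held the same rank in both years (the diagonal).
same_rank = {r: table[r][i] for i, r in enumerate(ranks[:3])}

# Words in the 1952 top three that ranked lower than third in 1927.
new_words = sum(table[r][3] for r in ranks[:3])

# Words in the 1927 top three that fell below third rank by 1952.
dropped_words = sum(table["lower"][:3])

print(same_rank)                 # {'1': 71, '2': 36, '3': 21}
print(new_words, dropped_words)  # 91 91
```

The shrinking diagonal (71, 36, 21) is the numerical form of the statement that stability increases with response strength.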

Strength of Primary Responses to Particular
Stimuli.—In the face of the facts that some pri-
mary responses had changed and that primaries
in general had greatly increased in strength, it
was of interest to enquire whether the relative 
strength of a primary response to a given stimu- 
lus in 1927 was related to the strength of the 
primary response to the same stimulus in 1952. 
Accordingly, three correlations were calculated: 
(a) the correlation between frequency of the pri- 
mary responses for all stimulus words on the two 
occasions, (b) the same correlation for the 71 
stimulus-response pairs which had not changed, 
and (c) the correlation for the 29 pairs for which 
the response had changed. The overall correla-
tion (N = 100 pairs) was +.70; the correlation
for same-response pairs was +.65, and correla-
tion for pairs in which the response had changed
was +.48. The three correlations are all signifi-
cantly different from zero. It is clear that the
relative ranks of the strength of primary re-
sponses to particular stimuli remain fairly stable
over time. Inspection of the scatterplots shows 
that the primary responses with high strength in
the 1927 norms did not tend to change but sim-
ply increased in strength as popular responding
increased. The primary responses which changed 
tended to be those which were at relatively low 
strength in 1927. Such responses were displaced 
by close competitors in the 1927 norms which 
moved up to become new low strength primaries. 
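The text does not say which significance test was applied to the three correlations. As an illustration only, the sketch below assumes the usual t transformation of a Pearson r with N - 2 degrees of freedom; the function name is hypothetical.

```python
import math

def t_for_r(r, n):
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Correlations and numbers of stimulus-response pairs as reported in the text.
comparisons = [
    ("all pairs", 0.70, 100),
    ("same-response pairs", 0.65, 71),
    ("changed-response pairs", 0.48, 29),
]

for label, r, n in comparisons:
    print(f"{label}: r = {r:+.2f}, N = {n}, t = {t_for_r(r, n):.2f}")
```

Under this assumption even the smallest correlation (+.48 on 29 pairs) gives a t of about 2.8, which exceeds the two-tailed .05 critical value of roughly 2.05 for 27 degrees of freedom.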

The Total Response Distribution.—The overall
distribution of frequencies of responses when 
plotted against response rank closely approached 
a Zipf-type distribution. When for the 1952 
norms the average frequency of each response 
rank was plotted against the rank on double 
logarithm paper, the result was almost a straight 
line from the first to the 90th rank. After the
90th rank (at which point the average frequency 
was slightly less than one, indicating that at least 
some of the response hierarchies had already been 
exhausted) the curve fell away rapidly toward 
zero. This distribution may be compared to the
rank-frequency distribution for the Kent-Rosanoff 
1910 norms reported by Skinner (20) and the 
rank-frequency distribution for 200 “other” stim- 
ulus words (which were not the K-R stimuli) 
collected by Schellenberg (18) and plotted by 
Cook and Skinner (2). The present distribution 
is appreciably more linear than either of the 
earlier distributions and is somewhat steeper, 
having a slope of approximately -1.4 as con-
trasted with roughly -1.3 for the Kent-Rosanoff
distribution and -1.1 for the Schellenberg “other”
words. 
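A slope on double logarithm paper corresponds to the least-squares slope of log frequency on log rank. The sketch below shows that calculation on synthetic values generated from an exact power law; the frequencies are not the 1952 norms, and the function name is illustrative.

```python
import math

def loglog_slope(ranks, freqs):
    """Least-squares slope of log10(frequency) regressed on log10(rank)."""
    xs = [math.log10(r) for r in ranks]
    ys = [math.log10(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic rank-frequency curve: frequency proportional to rank ** -1.4
# over ranks 1 to 90, the range described as nearly linear in the text.
ranks = list(range(1, 91))
freqs = [377.5 * r ** -1.4 for r in ranks]

print(round(loglog_slope(ranks, freqs), 2))  # -1.4
```

With real norms the points scatter around the line, and a steeper (more negative) slope indicates that responses are concentrated more heavily in the top ranks.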

The Nature of the Changed S-R Pairs.—The 
stimulus words which evoked differing primary 
responses in 1952 and 1927 are listed in Table 3 
with their old and new responses. The strength 
of each response is indicated by the percentage 
of Ss making that response. 

The writers are well aware of the difficulties 
of classifying associative pairs and are especially 
wary of “logical” as opposed to “behavioral” 
classifications. Nevertheless it is clear that at 
least nine of the 29 stimulus-response pairs in this 
group that were popular in 1927 were superordi- 
nate pairs (i.e., Butterfly-Insect, Red-Color, Cab- 
bage-Vegetable, Yellow-Color, Bible-Book, Sheep- 
Animal, Blue-Color, Lion-Animal, and Cheese- 
Food). If one permits a little ambiguity, the 
number of pairs classified as superordinate in- 
creases (i.e., one may add Whistle-Noise, Wish- 
Desire, Citizen-Man, etc.). The 1952 responses
stand in sharp contrast to this. None of the new
responses is clearly of a superordinate nature, and
only two could be considered to be even mar-
ginal superordinates (Sickness-Health? Butterfly-
Moth?). It appears from these data that super-
ordinate or classifying responses have decreased
in popularity since 1927.


TABLE 3


STIMULI AND THEIR PRIMARY RESPONSES
FOR PAIRS WHICH WERE DIFFERENT
IN THE 1927 AND 1952 NORMS


                             Percentage                  Percentage
                             of Subjects                 of Subjects
              1927           Making the   1952           Making the
Stimulus      Response       Response     Response       Response

Sickness      Illness            19       Health             37
Mutton        Sheep              34       Lamb               36
Comfort       Ease               13       Chair              12
Short         Long               33       Tall               39
Butterfly     Insect             20       Moth               14
Whistle       Noise              17       Stop               13
Wish          Desire             20       Want               12
Beautiful     Pretty             19       Ugly               21
Window        Glass              26       Door               19
Citizen       Man                21       U.S.               11
Red           Color              23       White              22
Sleep         Rest               25       Bed                24
Working       Labor              11       Hard               13
Earth         Ground             17       Round              13
Trouble       Sorrow              8       Bad                 9
Cabbage       Vegetable          40       Head               16
Yellow        Color              29       Blue               15
Justice       Judge              14       Peace              25
Bible         Book               29       God                23
Sheep         Animal             18       Wool               20
Bath          Water              23       Clean              31
Blue          Color              24       Sky                17
Stove         Heat               25       Hot                23
Doctor        Sick               15       Nurse              24
Thief         Robber             22       Steal              28
Lion          Animal             33       Tiger              26
Joy           Happiness          16       Happy              21
Baby          Child              18       Boy                16
Cheese        Food               12       Crackers           11

The second major characteristic of the 1927 
responses seems to be synonymity, but it is diffi- 
cult to obtain reliable judgments of this relation 
which are separate from and independent of su- 
perordination. The reader may judge for himself 
how nearly synonymous the stimuli and responses 
are. 


The prevailing characteristics of the 1952 re- 
sponses seem to be, on the one hand, “coordi- 
nation” and, on the other hand, “completion” or 
some form of sequential relation. Frequently, 
these “types” are fused in the new stimulus- 
response relation. As an example, Cheese- 
Crackers may be thought of as a pair of coordi- 
nates within a class of edibles or as a highly fre- 
quent sequential combination in everyday lan- 
guage, “cheese and crackers.” The same is true 
of Lion-Tiger, Doctor-Nurse, Justice-Peace, Red- 
White, Sickness-Health, etc. Again, systems of 
classification are inadequate to express the dif- 
ference between the norms quantitatively. 

Overall, the changes in primary responses seem 
to suggest that current responses are more spe- 
cific, more concrete, and perhaps more deter- 
mined by high frequency verbal sequences than 
the older responses. The older responses which 
have been discarded tended somewhat more to- 
ward abstraction, with the most popular re- 
sponses being those of superordination and 
synonymity. 


DISCUSSION 


At this point the import of the contrast of the 
1927 and 1952 norms needs to be stressed. We 
had suspected from experimental evidence that 
the word association norms had changed, but we 
had not been aware of the nature of the change. 
Before the data were gathered, we had assumed 
that the 1952 sample would show less agreement 
and greater diversity in responses than the 1927 
sample. The 1952 sample was more heteroge- 
neous (a higher proportion of youth were in 
school; veterans on the GI Bill represented per- 
sons with diverse service backgrounds and in 
many cases individuals who would not other- 
wise have been in school, etc.); and the sample 
was one year advanced in education over the 
1927 group (which is supposed to make for more 
individual responding). In short, we anticipated 
exactly the opposite findings from those we dis- 
covered. Instead of being more heterogeneous, 
in word association the group was startlingly 
more homogeneous. There had been marked 
change, but it was almost all toward conformity. 

Clearly there was some stability in free asso- 
ciation data across the 25-year period. The high 
frequency responses tended to persist as the most 
popular responses, and the ranks of the most 
popular responses tended to remain the same. 
On the other hand, we had found a greatly in- 
creased tendency on the part of the 1952 Ss to 
use the popular responses and a marked tendency 
to change low frequency primaries of the super- 
ordinate sort to other primary responses which 
tended to be more specific coordinates or comple- 
tions. 

The literature offered little aid in explaining 
our data. Kent and Rosanoff (11) had pointed 
out that their highly educated Ss made less use of 
the popular response words than the less well 
educated Ss in the norm group. Cook and Skin- 
ner (2) had used this finding to explain the dis- 
tribution of responses on Schellenberg’s (18) 200 
“other” stimulus words, where Ss had in fact used 
the popular responses less often and the infre- 
quent responses somewhat more often than Kent 
and Rosanoff’s Ss had for comparable ranks on 
different stimulus words. But Cook and Skinner 
were apparently not aware of the fact that on the 
100 K-R stimulus words where data were in fact 
comparable, Schellenberg’s sample used the most 
popular responses more often than Kent and 
Rosanoff’s “mixed adults.” 

We may at this point oppose Kent and Rosan- 
off’s finding and argue that increased education 
increases the use of popular responses, thus ac- 
counting for the differences between the 1927 and 
1910 norms and attributing the difference be- 
tween the new and the old Minnesota norms to 
the extra year of education, or we may argue that 
1952 sophomores were actually less well educated
than 1927 freshmen (and ignore the contrast
between the 1927 and 1910 norms); or,
finally, we may attribute the difference to some- 
thing other than educational level such as the 
methods employed in the various studies or the 
time-lapse and cultural change between samplings. 
The writers feel that the last alternative is the 
only one worthy of serious consideration. 

If one were going to attribute the difference 
to the method employed, one would presumably 
argue that the group-written form produces more 
common associates than the individual-oral form, 
thus accounting for the difference between the 
1927 and 1910 norms, and then feel free to intro- 
duce some other sort of variable to explain the 
difference between the 1927 and 1952 Minnesota 
groups. 

Woodworth and Schlosberg (25), however, feel
that the obvious time-pressure of the oral-individual
form of the test leads to even greater use
of the popular associates than the group-written 
method. In their chapter on “Association” they 
compare the results of individual testing of Brown 
University students with the 1927 Minnesota 
norms and suggest that the great increase in the 
frequency of popular responses is a consequence 
of the difference in method. Inspection of the 
limited data they present, however, shows a 
striking correspondence to the 1952 norms, which 
suggests that the method itself may make only a 
secondary contribution to the high frequencies 
observed. 

An unpublished study by Clousing (7) seems 
to confirm the negligible effect of method (as long 
as both administrations suggest time pressure). 
In her study, no differences were found between 
the group-written and the individual-oral admin- 
istrations in the frequency of primary, secondary, 
or tertiary responses. Keene (10) asserts that 
there is little difference between the two modes 
of testing, although he does not present his group 
data for comparison with his tables of individual 
data. From the data currently available, explana- 
tion of the differences between norms in terms of 
the differences between individual and group test- 
ing does not seem satisfactory. 

At this point it was decided provisionally to 
accept the findings concerning the Minnesota 
norms as indicating general trends in associative 
behavior, independent of specific sample or 
methodological differences. We proposed to test 
the findings by examining them in the larger con- 
text of the 1910, 1925, and 1933 norms. We 
felt that three hypotheses were testable. 

1. Responses of Ss to the K-R stimuli are in- 
creasingly concentrated among the popular re- 
sponses for all United States adult samples. We 
proposed to examine the five norms to see if 
there was evidence for a steady increase in the 
frequency of the common responses. 

2. Words used as responses to the K-R stimuli 
tend to change slowly over time, with the highest 
ranking responses having the highest stability. 
The amount of change for any particular rank 
should be a function of the time elapsing between 
the collections of the norm data. 

3. “Abstract” responses to stimuli tend to decrease
in popularity across the time period encompassed
by this study. Specifically, if superordinates
are taken as representative of abstractions,
they will be found to occur less and less
often as responses to stimuli, as the date of the 
norms becomes more nearly current. 

It should be noted that these hypotheses vary 
in the amount of support they receive directly 
from the Minnesota comparison already made. 
The first hypothesis has the most direct support, 
since the recent Minnesota norms had already 
been found to be more concentrated in the popu- 
lar associates than the earlier norms, which in 
turn were known to be more concentrated than 
the Kent-Rosanoff. Only if the O’Connor and 
Keene norms deviated from the trend could the 
hypothesis be denied. The second hypothesis, 
however, has many opportunities for confirma- 
tion or denial. Each set of norm data may be 
used as a standard, and the amount of difference 
between the standard set and other sets may be 
directly calculated. The third hypothesis simi- 
larly has several opportunities for real tests and 
may be relatively independent of the original data 
which suggested it (the data on changes in pri- 
mary responses alone). Here it was decided to 
obtain a relatively objective classification scheme, 
classify all superordinates for the stimulus words, 
and examine their frequency of occurrence in the 
several norms. The prediction, of course, was 
that a decrease in their use would be found over 
time. 


RESULTS OF GENERAL NORM COMPARISON 


Frequency of Popular Responses.—Table 4 
lists the frequencies of the most popular re- 
sponses (first, second, and third rank) for all 
five norm groups. It is immediately clear that 
the simple hypothesis of an increase in response 
frequencies over time will not by itself account 
for the data. The two norms which are closest 
together in time, 1925 and 1927, are widely
divergent with respect to response frequencies. The
1925 norm is very similar in frequency to the 
1933 norm and more like the 1952 norm than it 
is like the 1910 norm. The 1927 norm is most 
like the 1910 norm. The similarities and differ- 
ences here are remarkable in that they cut across 
the types of samples and methods of administer- 
ing the tests. 
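
The similarity claims in the paragraph above can be checked directly from the Table 4 figures. The following is a minimal sketch only: the function names are hypothetical, and using the absolute difference between three-rank totals as a measure of "likeness" is an assumption made for illustration, not a measure taken from the paper.

```python
# Frequencies per 1,000 responses for the first-, second-, and third-ranked
# responses, copied from Table 4 (keys are the norm years).
TABLE_4 = {
    1910: (260, 122, 77),
    1925: (338, 134, 80),
    1927: (289, 121, 79),
    1933: (357, 125, 78),
    1952: (375, 136, 80),
}


def totals(table):
    """Sum the three rank frequencies for each norm."""
    return {year: sum(freqs) for year, freqs in table.items()}


def closest_norm(table, year):
    """Return the other norm whose three-rank total is nearest to `year`'s.

    Assumption: absolute difference of totals stands in for similarity.
    """
    t = totals(table)
    others = [y for y in t if y != year]
    return min(others, key=lambda y: abs(t[y] - t[year]))


if __name__ == "__main__":
    for year, total in totals(TABLE_4).items():
        print(year, total, "closest:", closest_norm(TABLE_4, year))
```

Under this assumed distance, the 1925 norm pairs with the 1933 norm and the 1927 norm pairs with the 1910 norm, which is the pattern the text describes.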

It is tempting to suggest that two factors play
a part in the variation found in our data. One
factor would be the method of test administration
(accepting the Woodworth and Schlosberg (25)
suggestion that oral administration leads to higher
popular response frequencies); the other factor
would be the general cultural shift which progressively
increases the use of common responses.
At any one time, one would expect the oral
method to yield higher popular response frequencies
than the group-written method, but over
time both methods would show the increasing
drift toward popular responses. The data for primary
response frequencies show parallel lines of
increase when plotted for oral-individual and
group-written administrations separately.


TABLE 4


AVERAGE FREQUENCY OF THE THREE MOST
POPULAR RESPONSES TO 100 STIMULI
IN THE FIVE NORMATIVE STUDIES


Response Rank | 1910 | 1925 | 1927 a | 1933 b | 1952 a
First         |  260 |  338 |  289   |  357   |  375
Second        |  122 |  134 |  121   |  125   |  136
Third         |   77 |   80 |   79   |   78   |   80
Total         |  459 |  552 |  489   |  560   |  591


a Singular and plural forms of responses were combined
in computing frequencies to make these comparable
to the other norms.

b All frequencies expressed as frequencies per 1,000.


In view of the fact that there is no experi- 
mental support for the first factor and that so 
little is known concerning the actual collection 
of the 1925 norms, this explanation must be left 
as a conjecture and as a suggestion and stimulus 
for further research. 

An empirical finding which is quite clear, how- 
ever, is that at least college populations currently 
show a very high usage of common responses. 
The data for Brown University students have 
already been mentioned as reflecting the high 
frequencies found in the 1952 and 1933 norms. 
In addition, support for this assertion is found in 
the normative compilation of Meals, Herrick, and 
Merow (12) at the University of Pennsylvania 
for 50 of the K-R words. These investigators 
apparently selected stimuli which elicited high 
response frequencies as indicated by the 1910 
or 1927 norms. Frequency data for the same 
fifty stimulus words were abstracted from the
1952 norms for comparison with the Pennsylvania
data. The proportions responding at each rank
are paired for the Pennsylvania and Minnesota
norms respectively: first rank, .52, .47; second,
.11, .13; and third, .07, .07. It is easily seen
that the same intense concentration on popular 
responses is found for Pennsylvania college 
students. 

Overall, with respect to the first hypothesis, it 
may be said that the use of popular responses in 
the free association test is currently at a very 
high level in college populations, whether they 
are tested with the individual-oral or group-writ- 
ten method. The data strongly suggest that the 
frequency of popular responses is increasing over 
time, but that other factors in addition to the 
general trend are important in determining popu- 
lar response frequencies. 

Identity of Words Used as Responses.—Table 5 
lists the percentage of identical responses at each
rank level for each pairing of the five normative
groups. For the primary responses it is clear
that the hypothesis that there is a steady change 
with time is acceptable. Each norm group shows 
most overlap with the norm group which is 
closest chronologically. The primary responses 
in the 1952 norms are identical with 73 of the 
primary responses in the 1933 norms, 71 of the 
primaries in the 1927 norms, 65 of the primaries 
in the 1925 norms, and only 58 of the primaries 
in the 1910 norms. Conversely, the 1910 norms 
show identical primaries with 81 responses in the 
1925 norms, 79 responses in the 1927 norms, 74 
responses in the 1933 norms, and 58 responses 
in the 1952 norms. Over the entire table in simi- 
lar fashion all the values are in accord with the 
general hypothesis. 
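
The claim that overlap falls off with distance in time can be verified mechanically from the primary-response figures quoted above and in Table 5. This is a sketch under stated assumptions: the function names are hypothetical, and "steady change" is read here as meaning that overlap never rises when moving outward from a norm toward earlier or toward later norms.

```python
YEARS = [1910, 1925, 1927, 1933, 1952]

# Number of identical primary responses (out of 100 stimuli) for each pair
# of norms, as reported for the primary responses in Table 5.
PRIMARY_OVERLAP = {
    (1910, 1925): 81, (1910, 1927): 79, (1910, 1933): 74, (1910, 1952): 58,
    (1925, 1927): 83, (1925, 1933): 79, (1925, 1952): 65,
    (1927, 1933): 80, (1927, 1952): 71,
    (1933, 1952): 73,
}


def overlap(a, b):
    """Look up the overlap for a pair of norms regardless of order."""
    return PRIMARY_OVERLAP[(min(a, b), max(a, b))]


def non_increasing(seq):
    """True if no value in the sequence exceeds the one before it."""
    return all(x >= y for x, y in zip(seq, seq[1:]))


def falls_off_with_time(year):
    """True if overlap with `year` never rises as norms get further away.

    Earlier and later norms are checked separately, moving outward from
    `year` in each direction.
    """
    i = YEARS.index(year)
    earlier = [overlap(year, y) for y in reversed(YEARS[:i])]
    later = [overlap(year, y) for y in YEARS[i + 1:]]
    return non_increasing(earlier) and non_increasing(later)


if __name__ == "__main__":
    for year in YEARS:
        print(year, falls_off_with_time(year))
```

Checking each direction separately avoids having to decide how a gap of, say, two years backward compares with a gap of six years forward.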

For the secondary and tertiary responses, simi- 
lar results are found with good conformance to 
the hypothesis except that the 1925 and 1910 
norms are most alike in both cases and the 1933 
norms are slightly less similar to the 1927 norms 
than would have been expected. From either 
extreme, 1910 or 1952, the progression is quite 
orderly, however. 

While the 1948 Pennsylvania norms referred to
above are probably somewhat biased by the selection
of stimuli known to give high frequency response,
it is interesting to note in passing that
they show very high agreement with the 1952
norms. Eighty-two per cent of the primary responses
are identical, 42% of the secondary responses
are identical, and 32% of the tertiary
responses are identical. Inspection of Table 5
reveals that these approximate very closely the
percentages of overlap of the 1925 and 1927
norms (83, 48, and 32%), which are separated
by a similar short time period.


TABLE 5


PROPORTION OF IDENTICAL RESPONSES IN THE
FIVE NORMATIVE STUDIES


Primary Responses

Norm | 1910 | 1925 | 1927 | 1933 | 1952
1910 |  —   |  81  |  79  |  74  |  58
1925 |  81  |  —   |  83  |  79  |  65
1927 |  79  |  83  |  —   |  80  |  71
1933 |  74  |  79  |  80  |  —   |  73
1952 |  58  |  65  |  71  |  73  |  —

Secondary Responses

Norm | 1910 | 1925 | 1927 | 1933 | 1952
1910 |  —   |  53  |  44  |  26  |  26
1925 |  53  |  —   |  48  |  47  |  34
1927 |  44  |  48  |  —   |  43  |  36
1933 |  26  |  47  |  43  |  —   |  46
1952 |  26  |  34  |  36  |  46  |  —

Tertiary Responses

Norm | 1910 | 1925 | 1927 | 1933 | 1952
1910 |  —   |  36  |  20  |  19  |  10
1925 |  36  |  —   |  32  |  29  |  19
1927 |  20  |  32  |  —   |  27  |  21
1933 |  19  |  29  |  27  |  —   |  29
1952 |  10  |  19  |  21  |  29  |  —


Table 6 presents the other side of the consistency
picture, showing how many different words
are found in the first 300 response words for each
of the norm pairs. These data conform closely
to the hypothesis except in the degree of difference
between the 1933 and 1927 norms. In all
other respects the table is in accord with the
hypothesis.


TABLE 6


NUMBER OF WORDS DIFFERENT IN THE
FIRST THREE RESPONSES
(300 words)


Norm | 1910 | 1925 | 1927 | 1933 | 1952
1910 |  —   |  69  |  79  |  94  | 123
1925 |  69  |  —   |  66  |  71  |  95
1927 |  79  |  66  |  —   |  88  |  91
1933 |  94  |  71  |  88  |  —   |  77
1952 | 123  |  95  |  91  |  77  |  —

Considering all the findings, it seems clear that 
a major determinant of response-overlap or re- 
sponse-identity in word association norms is the 
time interval between collections of data. Where 
deviations from this simple prediction pattern do 
appear, there is a suggestion that the norms de- 
rived by the oral-individual method are slightly 
more alike than would be expected from time 
lapse prediction. That other variables are prob- 
ably active is of course conceded, but it is 
stressed here that time lapse alone accounts for 
most of the systematic variation. 

Use of Superordinates.—In order to obtain an 
unbiased and relatively objective standard for 
deciding which responses were superordinates, a 
written test was devised. The test consisted of a 
set of 100 sentences of the form “_____ is a
member of the class of _____.” Each sentence
began with one of the Kent-Rosanoff stimulus 
words. This test was given to 29 students in an 
introductory psychology laboratory course. A 
superordinate response was defined as any sentence
completion that was given by 15 or more
of the students taking the test. We realize that 
this procedure is arbitrary in the extreme, but 
it includes the stimulus-response pairs which most 
judges would agree are “really” superordinates, 
provides an unequivocal criterion for counting 
responses, is unbiased with respect to the norma- 
tive data to be judged, and excludes words for 
which superordinates are, to say the least, difficult
to define. This procedure yielded agreement
as to superordinate for 39 of the 100 K-R
stimuli. These stimuli and their superordinates 
are listed in Table 7 with the frequency with 
which the superordinate was given. 


TABLE 7


WORDS DEFINED AS SUPERORDINATES BY THE
CRITERION OF THE SUPERORDINATE TEST


Stimulus | Superordinate Response | Frequency (N = 29) | Stimulus  | Superordinate Response | Frequency (N = 29)
Red      | Color                  | 29                 | Swift     | Speed                  | 21
Yellow   | Color                  | 29                 | Spider    | Insect                 | 21
Bread    | Food                   | 29                 | Bed       | Furniture              | 20
White    | Color                  | 29                 | Ocean     | Water                  | 20
Green    | Color                  | 29                 | Slow      | Speed                  | 20
Butter   | Food                   | 28                 | Scissors  | Tool                   | 19
Blue     | Color                  | 28                 | Butterfly | Insect                 | 19
Black    | Color                  | 28                 | Heavy     | Weight                 | 18
Lion     | Animal                 | 27                 | Bible     | Book                   | 18
Sour     | Taste                  | 27                 | Cabbage   | Vegetable              | 18
Sheep    | Animal                 | 26                 | Blossom   | Flower                 | 17
Cheese   | Food                   | 25                 | Sickness  | Health                 | 17
Hammer   | Tool                   | 25                 | Music     | Art                    | 17
Eagle    | Bird                   | 25                 | Quiet     | Sound                  | 16
Chair    | Furniture              | 25                 | Lamp      | Furniture              | 16
Table    | Furniture              | 25                 | River     | Water                  | 16
Bitter   | Taste                  | 23                 | Salt      | Food                   | 15
Sweet    | Taste                  | 23                 | Stomach   | Organ                  | 15
Fruit    | Food                   | 22                 | Anger     | Emotion                | 15
Mutton   | Meat                   | 22                 |           |                        |
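
The 15-of-29 criterion described above amounts to a simple counting rule, sketched below. The function name is hypothetical, and the sample completions in the usage note are illustrative: only the Lion/Animal count of 27 comes from Table 7, while the remaining completions and the three-way split are invented for the example.

```python
from collections import Counter

CRITERION = 15  # minimum number of the 29 students who must agree


def superordinate_for(completions, criterion=CRITERION):
    """Return (word, count) if one completion meets the criterion, else None.

    `completions` holds one sentence completion per student for a single
    stimulus word. Case and surrounding whitespace are ignored.
    """
    if not completions:
        return None
    counts = Counter(c.strip().lower() for c in completions)
    word, count = counts.most_common(1)[0]
    return (word, count) if count >= criterion else None


if __name__ == "__main__":
    # 27 of 29 students agree, so the completion qualifies.
    print(superordinate_for(["animal"] * 27 + ["cat"] * 2))
    # No completion reaches 15, so the stimulus gets no superordinate.
    print(superordinate_for(["thought"] * 10 + ["sleep"] * 9 + ["fantasy"] * 10))
```

Because 15 is more than half of 29, at most one completion per stimulus can satisfy the criterion, which is what makes the rule unequivocal for counting purposes.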


The occurrences of the superordinate responses 
to appropriate stimuli were counted in the 1910, 
1925, and 1952 norms where complete data were 
available. The 1927 and 1933 norms could not 
be used in this comparison because only the three 
most popular responses are available for each 
stimulus. Accordingly, a separate tabulation was 
made of the 26 superordinate responses which 
were present in the 1927 norms and the 18 super- 
ordinate responses which were present in the 
1933 norms. The data are presented in Table 8. 

Inspection of the table verifies the hypothesis 
concerning the decline of superordinate responses. 
The frequency of occurrence of the 39 super- 
ordinate responses identified has declined in the 
1952 norms to just slightly more than half the 
frequency manifested in the 1910 norms. The 
frequency of occurrence of the 26 superordinates 
present in the 1927 norms (1st, 2nd, and 3rd 
ranks), even though biased upward for that 
norm set because of selection, shows a marked 
decrease from 1910 to 1952. The frequency of 
occurrence of the 18 superordinates present in 
the 1933 norms does not even appreciably reflect
the selective bias but falls steadily from 1910
to 1952.


TABLE 8


USE OF SUPERORDINATES AS RESPONSES


                                         | 1910 | 1925 | 1927 | 1933 | 1952
39 Superordinates: average percentage    | 17.5 | 18.4 |  —   |  —   |  9.6
26 Superordinates a: average percentage  | 24.7 | 22.3 | 22.6 |  —   | 13.9
18 Superordinates a: average percentage  | 31.1 | 28.3 | 27.9 | 22.4 | 19.2


a For the 1927 and 1933 norms only the first three responses are
available. Twenty-six superordinates occurred in the first three
ranks in 1927 and 18 superordinates occurred in the first three ranks
in 1933. All those appearing in 1933 also appeared in 1927. The
second and third rows of the table are biased upward for 1927 and
1933, therefore.

A detailed response-by-response comparison 
confirms this trend as a general case not attri- 
butable to a marked change in a few responses. 
Comparison of the 1910 and 1952 norms shows 
decreasing frequencies for 33 of the 39 super- 
ordinate responses. Further, for what might be 
called the clearest cases of superordination (those 
identified on the superordinate test with a fre- 
quency of 20 or greater), 24 out of 25 responses 
have declined. Comparison of the 1925 and 
1952 norms shows a decline in 31 of the 39 
superordinate responses and 23 of the 25 “strong” 
superordinate responses. The comparison of the 
1910 and 1925 responses, while not as overwhelming,
is still conclusive; 30 of the 39 superordinates
declined in frequency and 21 of the 25
“strong” superordinates declined in frequency. 
The decline of popularity of the superordinate re- 
sponse appears to be strikingly demonstrated. 

Discussion.—The first hypothesis concerning
the general increase in the frequency of popular
responses appears only partially confirmed. It
is clear from the data that popular responses are
currently at very high frequencies in college
populations, but the five normative summaries do
not support the notion of a simple trend in this
direction over time on all United States adult
samples. The 1925 sample shows a much
greater use of popular responses than the 1927 
sample. The most parsimonious explanation at 
this time seems to be a two-factor explanation 
recognizing both the drift toward popular re- 
sponses over the time-span included in this study 
and a tendency for the oral-individual test technique
to generate more popular responses than
the group-written technique. 

The second hypothesis concerning the change 
in identity of popular response words over time 
seems to be adequately confirmed. Norms col- 
lected closest in time show the maximum amount 
of overlapping in responses (regardless of sample 
differences), and high frequency responses prove 
to be much more stable than lower frequency 
responses. There is a further suggestion that test 
technique contributes to similarity of responses. 

The third hypothesis concerning the decline in 
the use of superordinates as word association re- 
sponses is likewise adequately confirmed. A clear 
and highly significant trend was discovered and 
the evidence further indicates that the more 
clearly the responses were identified as super- 
ordinates, the more marked the trend was seen 
to be. 

Interpretations of these findings are at best 
conjectural. The writers have been tempted to 
discuss the study in terms of the growth of “mass 
culture,” “other-directedness” and similar con- 
structs much used today in characterizing our 
culture. The enormous growth of the mass 
media, the standardization of school courses, pro- 
cedures and textbooks, and the presumably 
greater homogeneity of our verbal culture are 
appealing variables to invoke to explain the phe- 
nomena manifest in our data, but no evidence 
can now be marshalled to support such a dis- 
cussion. 

A much simpler explanation for much of the 
data suggests itself; namely, that test-taking atti- 
tudes have been changing over the time period 
discussed here. The work of Dunn, Bliss, and 
Siipola (4); Siipola, Walker, and Kolb (19); and 
to a lesser extent that of Flavell, Draguns, Fein- 
berg, and Budin (6) clearly establishes the im- 
portance of the time pressure set in producing 
popular and “superficial” responses, such as con- 
trasts and coordinates, and decreasing responses 
which have a synonymous or superordinate relation
to the stimulus. These modern studies of
test-taking attitude mirror the effects achieved by 
Wells in a much older study of practice effects in 
free association. Wells (23) gave his Ss 20 asso- 
ciation lists of 50 words (one list per day) and 
then retested them on the first two lists. He ob- 
served a marked decrease in reaction time and 
a change to “superficial” associations. He reports
the qualitative changes in response for each
of his six Ss, and for each one it is remarked that
the frequency of superordinate responses decreased
(the decreases vary from one-half to two-thirds
of the original number of responses so
classified). 

It would appear that practice effects (“test- 
wiseness”) and time pressure (“test-attitude”) 
bring about much the same effects: so-called 
superficiality of response, decrease in superordi- 
nates, and increased popular responding. If we 
assume that the oral-individual form maximizes 
time pressure and that more recent dates give us 
more and more test-sophisticated Ss we would 
seem to have all the necessary variables required 
to account both for the frequency data and super- 
ordinate data. 

It is true of course that “test-wiseness,” or 
“test-docility” of Ss is after all a general product 
of our culture and that this explanation may still 
be related to general questions of cultural change 
on a grand scale. However, this explanation has 
the virtue of greater specificity and seems to 
furnish a point of attack for further investigations 
which the more sweeping explanation fails to 
furnish. 

It should be noted beyond this, however, that 
regardless of the explanation given for the in- 
creasing frequencies of popular responses and 
decreasing frequencies of abstract responses, the 
systematic change in the overlapping of specific 
stimulus-response pairs as a function of time- 
lapse between normative studies is in the main 
separable from these phenomena. This system- 
atic change appears to the writers to be attribu- 
table not only to the drift away from abstract 
responses but also to changes in the “meaning” 
of words (e.g., “lamp” in 1910 apparently meant 
an “oil lamp” in contrast to its current meaning) 
and to particular recent experiences concerning 
the words which were not duplicated during the 
other norm-collection periods (e.g., in 1952, a 
strenuous presidential campaign was in progress 
and “whistle-stop” became a very popular asso- 
ciative pair). These changes could profitably be 
studied as part of a general study of the change 
in meaning of common words over time. 


SUMMARY 


A comparison was made of word association 
norms collected in 1927 and in 1952 for University
of Minnesota students. It was found that
primary responses had greatly increased in frequency
and that the first three responses to the
stimulus words accounted for 59% of all re- 
sponses in 1952 as contrasted with 49% in 1927. 
Seventy-one per cent of the primary responses 
were identical. Primary responses were found to 
be more stable than secondary or tertiary re- 
sponses. The frequency of the primary response 
to individual stimuli correlated +.70 between the 
two norms. Primary responses which had 
changed, tended to be low-strength superordinate 
responses which were replaced by coordinates and 
completions. 

Three hypotheses arising from the 1927-1952 
comparison were examined with respect to five 
major collections of free association data: the 
1910 Kent-Rosanoff norms, the 1925 O’Connor 
norms, the 1933 Keene norms, and the 1927 and 
1952 Minnesota norms. The first hypothesis, 
that there is a general tendency for the frequency 
of popular responses to increase with time, re- 
ceived partial confirmation. It was concluded, 
however, that other factors such as testing 
method probably exercised some effect in addi- 
tion to the time change. 

The second hypothesis was that the words used 
as responses to stimuli tend to change slowly but 
systematically over time with the highest ranking 
responses having the highest stability. This hy- 
pothesis was confirmed. 

The third hypothesis was that abstract re- 
sponses (specifically superordinates) have tended 
to decrease in popularity across the time period 
of this study. Superordinates were identified for 
the Kent-Rosanoff stimuli by means of a sentence 
completion technique. The hypothesis was con- 
firmed. 

Possible explanations for these phenomena 
were mentioned and it was suggested that a gen- 
eral change in test-taking attitudes might account 
both for the increase in popular responses and the 
decrease in abstract responses. It was further 
suggested that the change in individual response 
terms (aside from the frequency and abstraction 
changes) might be attributable to changes in the 
meaning of particular stimuli over time. 


When a significant decision must be made about 
an individual it would seem wise to obtain as 
much relevant information as possible concerning 
that individual. If the decision to be made is 
related to personality assessment and evaluation, 
then, it would follow that several measures of 
personality, rather than one, be obtained from 
the individual. What measures to obtain, how to 
obtain them, and how to analyze them are im- 
portant questions confronting the personality 
assessor. Taft, in the first paper in this section, 
explores these and related questions. For example,
he also examines one of the major overriding
problems in personality testing, that of the
behavioral standards and criteria with which the 
data obtained from assessment are to be com- 
pared. By means of a number of examples of 
assessment techniques, Taft demonstrates clearly 
the difficulties, challenges and potentialities of re- 
cent research on personality assessment. 

Put somewhat crassly, one challenge facing the 
researcher in the area of personality assessment 
is that of what to do with all of the test scores 
and information obtained. When an assessor has 
available a large amount of data, how does he 
proceed from the mere listing and enumeration 
of his data to the making of a relatively small 
number of statements and generalizations con- 
cerning personality functioning? In the article 
written by Cattell, a factor analytic solution to 
this problem is outlined. By means of the tech- 
niques of factor analysis, a great number of test 
scores may be accounted for in terms of a small 
number of underlying factors. Cattell’s paper in 
addition to outlining the ways in which factor 
analytic methods can be applied to personality 
assessment also suggests ways in which factor 
analysis can be related to theory construction in 
the area of personality. 

Another, and rather different, analysis of the
role of theory in personality assessment is presented
by Rotter in the third article of this section.
Rotter argues that for progress to occur in
the field of personality assessment there must be 
greater recognition of the test situation as an 
interpersonal, behavioral event. This approach 
requires the tester to ask such questions as: Why 
should I use Tests X, Y, and Z rather than some 
other combination of the dozens of other avail- 
able tests? Under what conditions shall I require 
subjects to respond to the test? What predictions 
of subject’s behavior do I want to make on the 
basis of the tests administered? Some guide lines 
along which answers to questions such as these 
may be considered are presented by Rotter in an 
outline of his social learning theory of behavior. 


MULTIPLE METHODS OF 
PERSONALITY ASSESSMENT * 


RONALD TAFT 1 


The term “personality assessment” refers to any 
procedure aimed at describing a person’s charac- 
teristic behavior by categorizing him with respect 
to some communicable dimension or dimensions.2 
Since the OSS assessment procedures, however, the 
term has tended to be pre-empted for the proce- 
dure where several different types of assessment 
techniques are applied to the subjects and the 
final assessments are made by the combined judg- 
ments of several assessors concerning the subjects’ 
predicted behavior outside of the assessment situa- 
tion. These procedures are “multiple” in two 
senses: with respect to the techniques and with 
respect to the assessors. 

Our treatment in this paper will deal with the 
basic logic of this type of assessment, and the dis- 
cussion will be illustrated by the best known mul- 
tiple personality assessments, details of which are 
outlined in Table 1.3 Each one of these assessments
has, in its own way, constituted a milestone
in the history of multiple personality assessment.

* Reprinted by permission from the Psychological
Bulletin, September, 1959, Vol. 56, No. 5, 333-352.

1 The author expresses his thanks to the colleagues
who have discussed various points in this paper with
him, especially to James Lumsden; also to Saul B.
Sells for his valuable comments.

2 This is the same procedure as “instantiating a
person object in a module or set of modules,” a
terminology which the writer has preferred in another
context (36), but which is avoided here in the interests
of communicability.

3 Insofar as the assessments use multiple techniques,
the problems of inferring the predictions and validating
the tests are the same as those involved in other
multi-variate procedures. See, for example, the treatment
of these problems in Thorndike (42). Our
emphasis here will be mainly on the problems that
arise from the combination of multi-variate procedures
and multiple assessors.

The researches into personality conducted at
Harvard in the 30’s under the direction of Murray
(31) were the first to use the typical procedures
of personality assessment—diagnostic committee
assessments of personality based on interviews and
a varied battery of objective, projective, and situational
tests. However, unlike the later assessments,
no outside criterion was used in these Harvard
studies, and, therefore, no more than passing
reference will be made to them. The same applies
to the continuing series of studies of personality
carried out by Cattell and his students
(3) which started to employ external criteria only
at an advanced stage of its progress. The British
War Officer Selection Boards (WOSB), which
were inspired by the German officer multiple
technique selection procedures (9), pioneered the
use of a quasi-natural social situation, including
the leaderless discussion, as a basis for judging
the potential social skills of the candidate. They
also produced the first validation material on multiple
assessment procedures as a means of selection.
The British Civil Service Selection Boards
(CSSB) continued this work, with more emphasis
on the validation of individual techniques as well
as the technique as a whole. The OSS assessment
highlighted the psychological problems inherent
in assessment and won many supporters
for the value of combining multiple tests and observations
by pooling the judgments of several assessors;
the Michigan VA assessment program did
much to upset that support while the Chicago and
Menninger assessments reinstated some of it
through their promising findings. The California
Institute of Personality Assessment and Research
(IPAR) differs from the other assessments in emphasizing
research into personality to a greater
extent.


THE ORIENTATION AND PURPOSE
OF PERSONALITY ASSESSMENTS


Three foci of assessment can be distinguished:
human performance in some socially defined situation
or situations (the criterion performance);
performance in defined assessment situations, i.e.,
tests (the assessment performance); and the link 
between these two performances (test validation). 
Different assessment programs have been oriented 
towards one or more of these aspects depending 
on their primary purpose (see Table 1). The 
orientation towards criterion performance implies 
the primary purpose of assessing candidates with 
respect to the criterion in order to select or reject 
them. The orientation towards the assessment 
performance is concerned with the validation of 
the assessment techniques themselves, while the 
orientation towards the link between performances 
is concerned with research on the functioning of 
personality. 

Selection was the original purpose for the 
WOSB, CISSB, and OSS assessments; in each 
case, the assessors were presented with the imme- 
diate problem of selecting from a given group of 
candidates those who would make the most ade-
quate army, civil service, and secret service offi-
cers, respectively. After a consideration of the
personality requirements of the positions for which 
they were selecting officers, the assessors judged 
the candidates on the basis of techniques chosen 
either because they appeared to have face validity 
for measuring these requirements or because the 
assessors were familiar with their use. At least 
in the case of wartime assessments, neither the 
time available nor the conditions permitted more 
scientific procedures than that, and it was hoped 
that accuracy would be achieved through weight 
of numbers (of techniques and of assessors). 

Test Validation (and Construction).—Some of
the later assessments, notably the VA study, set
as their short run aim the task of developing and
validating techniques for future use in selection.
The validation studies were applied not only to
the individual items and tests, but also to the
purely subjective techniques, such as group ob-
servations and interviews. In some of them, e.g.,
Menninger, the individual judges were also vali-
dated as though their judgments were scores on
a test. When we speak of the validity of assess-
ment techniques it is important to include these
judgments among the techniques, as they vary
greatly in their accuracy.

The Harvard studies were the first to use mul- 
tiple assessment techniques for personality re- 
search, and the outstanding recent example is the 
IPAR work at California. (The large-scale factor
analytic studies of personality (e.g., 3, 8) did not
use the combined judgments of several assessors.)
The Harvard studies dealt with the correlations 
between different performances that were elicited 
in the test situation, whereas the IPAR studies 
were concerned, in addition, with the relationship 
between the assessment performances and cri- 
terion measures such as ratings by university 
teachers of the subject’s professional potential, his 
originality, and his personal soundness. The 
Michigan studies of clinical psychologists were 
similar in orientation. 

Most of the personality assessments have tried 
to pursue more than one of the above purposes at 
once, but there are drawbacks to such attempts 
at economy. For example, an attempt was made 
in the CISSB studies (44) to combine selection 
with validation, but the validation indices were 
lowered and distorted by the attenuation of the 
sample through rejection of candidates. The low 
validities obtained became remarkably high (for 
that sort of prediction) when a correction for se- 
lection was applied, but such corrections are only 
arbitrary estimates. The use of assessment pro- 
cedures for selection implies that the procedures 
have already been validated, but this has usually 
not been the case. The assessors have either had 
to use whatever prior knowledge they possessed 
about the validity of the techniques for the pur- 
pose at hand, or they have had to base their pre- 
dictions on the relevant postulates in their theory 
concerning the link between the assessment and 
the criterion behavior of the subjects. For ex- 
ample, the assessors presume that the situational 
tests in the assessment program have what Cron- 
bach and Meehl have termed “content” validity 
(5). But in selection, this type of validity can 
be regarded only as a holding procedure for an 
ultimate “predictive” validity. Where the cri- 
teria are imprecise and not repeatable, or where 
selection is urgent, a separate validation study 
may not be practical, and under these circum- 
stances there is no alternative to conducting se- 
lection without prior validation. It still may be 
possible, however, over a period of time, to uti- 
lize the imperfect validational material that be- 
comes available in order to improve the existing 
selection procedures. This seems to have been 
the case, for example, in the OSS studies. 
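The correction for selection applied to the CISSB validities earlier in this paragraph can be illustrated by a small sketch. The source does not state which formula was used; the example below assumes the familiar univariate correction for direct restriction of range on the predictor, r_c = r u / sqrt(1 - r^2 + r^2 u^2), where u is the ratio of the predictor's standard deviation among all candidates to its standard deviation among those selected. The function name and the numerical values are hypothetical.

```python
import math


def correct_for_selection(r_restricted, sd_ratio):
    """Estimate the validity in the unselected group (assumed formula).

    r_restricted: validity observed among the selected candidates only.
    sd_ratio:     SD of the predictor in the full candidate group divided by
                  its SD in the selected group (a value of 1.0 or more when
                  selection has narrowed the range).
    """
    if not -1.0 <= r_restricted <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    if sd_ratio <= 0:
        raise ValueError("the SD ratio must be positive")
    r, u = r_restricted, sd_ratio
    return (r * u) / math.sqrt(1.0 - r * r + r * r * u * u)


if __name__ == "__main__":
    # Hypothetical values: the same observed validity under several assumed
    # degrees of range restriction.
    for u in (1.0, 1.5, 2.0, 3.0):
        print(u, round(correct_for_selection(0.25, u), 3))
```

The corrected figure depends heavily on the assumed ratio of standard deviations, which is one way of seeing why such corrections yield estimates rather than observed validities.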

Validation studies of the assessment techniques 
also logically precede the use of those procedures 
for personality research, although techniques used 
in such research often are accepted on the basis of
their face validity. To use one and the same study
to validate the techniques and to use them to 
measure personality is lifting oneself up by one’s 
boot-straps. In fact, however, the assessments 
which attempt to carry out this dual purpose ob- 
tain independent support for the “boot-strap lift” 
from already existing information regarding both 
validation and the functioning of personality.
Even then, the interpretation of personality re-
search projects that do not commence with a pilot
study on the validity of the instruments is al-
ways subject to doubt. How do you know that
expressed hostility to authority figures on the
TAT measures suppressed rebellious tendencies?
How do you know, when assessor X observes a
subject to be dominant, that he is dominant?
How do you know that observed “role empathy”
in a role-playing test is a valid predictor of social
skill? Such questions can be answered only by
the progressive refinement of validity information
and personality theory.


TABLE 1
DETAILS OF MILESTONE ASSESSMENTS

(Wide sample of techniques used: individual and multiple interviews; observation of group activities and situational tests; objective, projective, and performance tests; “made to measure” inventories)

Assessment | Dates | Subjects | Purpose | Strategies (in order of importance) | Assessors | Criterion Analysis Method | Method of Combining Judgments | Main Criterion | Some Validities
Harvard (31) | 1934-37 | Young men, mainly Harvard undergraduates (paid subjects) | Personality research | Analytic | Psychologists | - | Committee | No external criteria | -
WOSB (14, 30) | 1942-45 | British officer candidates | Selection | Analytic, Global | Army officers, psychiatrists, and psychologists | Personal knowledge | 1. Committee and 2. Final Review Board | Supervisors’ reports | 1. CISSB committee 0.13-0.25. 2. Review Board 0.23-0.41. (When corrected for selection the range of validities is 0.50-0.66.)
OSS | 1944-45 | U. S. Intelligence and espionage agent candidates | Selection | Analytic, Global | Psychologists, psychiatrists, and other social scientists | Intuitive and interview with experts | Committee | Field reports by the assessors and by field commanders on several molar traits | “Over-all” ratings 0.08-0.53 (varying with assessment group and criterion). Rating of “Effective Intelligence,” 0.33-0.53
Michigan, VA (21) | 1946-49 | Clinical psychology graduate students | Validation of techniques | Empirical, Global | Psychologists (clinical and nonclinical) | Personal knowledge | Individual and pooled ratings | Ratings by clinical teachers and supervisors on several aspects of clinical work | “Over-all” rating and clinical competence, 0.37. Miller Analogies and clinical competence, 0.35. Strong Interest Key for ...
California IPAR (various published and unpublished reports, e.g., 2, 13) | 1950-51 | Advanced graduate students | Personality research; validation of techniques | Empirical, Analytic, Global | Psychologists | Personal knowledge | Committee | 1. Teacher’s prediction of student’s professional potential. 2. Teacher’s ratings of personal soundness | 1. Cross-validated i... 2. Committee rati...
Chicago (39) | 1952-54 | Students in theology, education, and arts | Validation of techniques | Analytic | Psychologists | Committee job analysis and interviews with teachers | Committee | Teacher’s judgments and exam results | Very high validiti...
Menninger (16) | 1946-52 | Psychiatric training candidates | Selection, validation of techniques | Global, Analytic, Empirical | Psychiatrists and psychologists | Committee job analysis and success and failure | Individual and averaged ratings of interviewers | Supervisor’s ratings (pooled) on specific and general competence | Interviews (global), 0.24. Interviews (analytic), 0.26. Tester’s analytic ratings on projectives, 0.27. Objective scoring of projectives cross-validated at zero. Best interviewer (all data), 0.57


Assessment procedures usually rely on many
unvalidated tests, and when the correlations be-
tween the tests are used as a means of studying
personality—as in the case, for example, of the
Harvard and IPAR studies—it is necessary to de-
cide whether these correlations are to be treated
simply as validity indices, or whether the valid-
ity of the tests will be assumed and the correla-
tions treated as throwing light on the relation-
ships between different personality structures. The
problem of simultaneous validation of tests and
the study of personality is related to the problem
of “concurrent” and “construct” validity. By set- 
ting up some of the behavioral measures made 
during the assessment as tentative criteria, it is 
possible to validate other assessment measures 
against these. Cronbach and Meehl call this con- 
current validation, and it is one way of utilizing 
previous knowledge of validities by choosing cri- 
teria measures that have reasonably well-estab- 
lished reliabilities and validities. Then on the 
basis of all that is known about these measures, 
their implications for the understanding of per- 
sonality can be explored further by a strategy of 
construct validation. The data collected during 
the assessment can be added to the “nomological 
nets” already used in thinking about the particu- 
lar personality constructs and new hypotheses de- 
veloped for investigation in later studies. Thus, 
even an assessment program that is aimed primar- 
ily at the purpose of selection can make a con- 
tribution to personality research through construct 
validation. (The place of construct validity in an 
assessment program is discussed more specifically 
below under the heading of analytic strategy.) 
This concept also enables an assessment program 
to avoid the problem of the priority of validation 
of instruments (versus conducting personality re- 
search) by conceiving both validation and person- 
ality research as two aspects of the one endeavor, 
both aspects gradually throwing light on each 
other as more and more data accumulate. 

But this double-aspect approach of construct 
validation is an uneconomical process. Refine- 
ments may often be made more readily to our 
personality theory or to our knowledge of the 
validity of the techniques by a more direct ap- 
proach to one or the other. In this case the prob- 
lem of priorities which we have discussed cannot 
be avoided. 


THE PREDICTION STRATEGIES IN ASSESSMENT 


The Criterion.—All assessment programs in- 
volve studies of the link between two or more 
pieces of behavior, whether the primary purpose 
be selection, validation research on tests, or per- 
sonality research. Some of this behavior is known 
as assessment behavior and some as criterion be- 
havior. These concepts are analogous to the in- 
dependent and dependent variables in experi- 
mental psychology, and it is an arbitrary decision 
by the experimenter which one is designated as 
which. Most of the reports of assessments have
devoted some space to the criterion problem, es-
pecially the report of the Chicago assessments
(39). Most of the problems are similar to those 
involved in the validation of multivariate objec- 
tive techniques discussed, for example, by Thorn- 
dike (42). 

A special problem that arises in personality as- 
sessment is the frequent unreliability of the cri- 
teria which so often represent subjective judg- 
ments that vary from one criterion rater to 
another. This unreliability imposes a serious lim- 
itation on the potential validity of personality 
assessments, and it makes it difficult to evaluate 
some of the low validity coefficients reported. 
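The limitation that criterion unreliability places on validity can be made concrete with the classical attenuation relation, under which an observed validity cannot exceed the square root of the product of the predictor and criterion reliabilities. The sketch below assumes that classical model; the function names and the reliability figures in the usage lines are hypothetical and are not taken from any of the assessments discussed.

```python
import math


def max_observable_validity(criterion_reliability, predictor_reliability=1.0):
    """Ceiling on the observed validity under classical test theory."""
    for value in (criterion_reliability, predictor_reliability):
        if not 0.0 <= value <= 1.0:
            raise ValueError("a reliability must lie between 0 and 1")
    return math.sqrt(criterion_reliability * predictor_reliability)


def correct_for_criterion_unreliability(observed_validity, criterion_reliability):
    """Estimate the validity against a perfectly reliable criterion."""
    if not 0.0 < criterion_reliability <= 1.0:
        raise ValueError("criterion reliability must be in (0, 1]")
    return observed_validity / math.sqrt(criterion_reliability)


if __name__ == "__main__":
    # Hypothetical case: supervisors' ratings with an inter-rater reliability
    # of 0.49 and an observed validity coefficient of 0.35.
    print(max_observable_validity(0.49))
    print(correct_for_criterion_unreliability(0.35, 0.49))
```

A low coefficient obtained against a criterion of this kind may therefore reflect the criterion raters' disagreement as much as any weakness in the assessment.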

The designation of the criteria of performance 
is determined by the circumstances of the assess- 
ment, and usually must be taken for granted by the 
assessors. Thus, in the Chicago study the as- 
sessors explicitly accepted the principle that the 
criterion ratings represented the predilections of 
one or more supervisors with whom the subjects 
interacted in the criterion situation, and that the 
assessors’ predictions of the subjects’ success must 
be made in reference to the “psychological job 
requirements” implied by these predilections and 
interactions. The assessment strategy should be 
aimed at the criterion, once the latter has been 
established. Kelly (19) did not accept this prin-
ciple in his researches on medical school selec-
tion. In this study he analyzed the criterion
measures and found that there were at least 
three, and possibly four, types of medical per- 
formance which could be predicted independently. 
In the long run, however, a selection program has 
to choose between the independent criteria, or the 
criteria have to be combined by some type of
simple, weighted, or complex interactional sum-
mation, or by taking account of one critical in-
stance.

A complication that arises in criteria analysis, 
such as that of Kelly, is that an assessor can only 
predict to indices of the criteria, not to the actual 
criteria themselves. It may be possible in some 
instances for the assessor to demonstrate that an 
index used in assessment has a low correlation 
with some more satisfactory, although less ac- 
cessible, criterion index; for example, that aca- 
demic grades in medicine do not represent the 
doctor’s subsequent service to the community as 
a practitioner. Assuming that the latter is ac-
cepted as the more fundamental in medical prac-
tice, the assessors should predict to it rather than
to academic grades by trying to obtain some ac-
cessible index which more realistically measures
this criterion of community service. Sometimes 
the assessors may be able to convince those who 
control the criterion ratings that the indices which 
the latter are using are not consistent with their 
fundamental criterion, but eventually the assessors 
and the criterion raters must agree on some cri- 
terion index in accordance with the policy of the 
organization. Otherwise it would be absurd to 
speak of the validity of the assessment. 

Three types of strategies can be distinguished 
for predicting the criteria performance: naive 
empirical, global, and analytic, and we shall now 
consider each strategy in detail. 

1. Naive empirical. This refers to the classi- 
cal method of test construction, adapted from ap- 
titude testing, in which the inclusion in a selection 
program of a test—or test item, which may be 
treated for our purposes as a separate test—is 
determined mainly by its predictive validity, i.e., 
by the degree to which it correlates with or dis- 
criminates a specified criterion. Tests that are 
not sufficiently valid are either dropped from the 
program or amended, and no consideration is given
to the meaning of the test behavior, except as an 
afterthought. The naive empirical strategy, thus, 
is one in which inference proceeds directly from 
test to criterion without the mediation of inter- 
vening variables. 
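The naive empirical procedure just described may be sketched in a few lines. The example assumes that items are retained when their correlation with the criterion in a derivation sample reaches a chosen threshold, that retained items are keyed by the sign of that correlation and summed with unit weights, and that the resulting scale is then checked on a separate cross-validation sample. The function names, the threshold, and the data layout are hypothetical.

```python
import math


def pearson(x, y):
    """Product-moment correlation; returns 0.0 if either list is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def select_items(item_rows, criterion, min_abs_r):
    """Return {item index: +1 or -1} for items that pass the threshold.

    item_rows holds one list of item responses per subject. No attention is
    paid to what an item means; only its correlation with the criterion counts.
    """
    keyed = {}
    for j in range(len(item_rows[0])):
        r = pearson([row[j] for row in item_rows], criterion)
        if abs(r) >= min_abs_r:
            keyed[j] = 1 if r > 0 else -1
    return keyed


def unit_weight_score(row, keyed):
    """Sum the keyed items with weights of +1 or -1."""
    return sum(sign * row[j] for j, sign in keyed.items())


def cross_validate(derivation, holdout, min_abs_r):
    """Build the key on one sample and compute its validity on another."""
    keyed = select_items(derivation[0], derivation[1], min_abs_r)
    scores = [unit_weight_score(row, keyed) for row in holdout[0]]
    return keyed, pearson(scores, holdout[1])
```

Because the key is built and evaluated on different subjects, the second coefficient is not inflated by chance relationships in the derivation sample.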

Not a great deal of use has been made of this 
empirical strategy in multiple personality assess- 
ment, partly because of intellectual resistance to 
atheoretical procedures on the part of personality 
researchers, and partly because of the absence of 
reliable criteria. The outstanding examples of 
the use of the naive empirical strategy are found 
in the IPAR studies, especially in the scales of the 
California Psychological Inventory and the Ad- 
jective Check List compiled by Gough. These 
scales, which give unit weight rather than beta 
weights to the tests, i.e., items, enable predictions 
to be made to quite complex behavioral criteria, 
for example, tolerance, delinquency, academic 
achievement, neurodermatitis, potential social sta- 
tus (Gough, unpublished bibliography, IPAR, 
1955). 

The naive empirical strategy has the advan- 
tage over other strategies of objectivity and also 
of enabling assessors to predict complex and little 
understood behavior. But it also has serious 
limitations: it can be used only where suitable cri-
teria groups are available for validation and cross-
validation, and the validities may “drift” owing to
changes in significant aspects of the conditions— 
temporal, geographic, public attitudes and infor- 
mation, set of the subjects, etc. Either some un- 
derstanding of the underlying theoretical factors 
is necessary to provide a warning system against 
“drift,” or constant revalidation must be carried 
out. 

The primary purpose served by the naive em-
pirical approach is that of constructing and vali-
dating assessment instruments, although the
long-range purpose can be both selection and research
on personality. Up to a point, the personality re- 
search aim can be served simultaneously with the 
validation aim, since the discovery of the inter- 
correlations between the tests themselves and the 
criteria can suggest personality constructs. But 
we are now back to the problem of priorities: we
can use validation studies for personality research 
only if we already possess postulates about the 
significance for personality of the behavior tapped 
by the tests and the criteria. 

In this connection we should briefly consider the
sources of the test items that are used in the vali- 
dation “tryouts.” The sources may be naive em- 
pirical, or they may be theoretical. Empirical 
sources include: tests in the general area that are 
traditionally used, those that are readily available 
and can conveniently be given, tests whose title 
or item content bears a superficial relationship to
the criterion, and tests which have previously been 
shown to relate to the criterion. Theoretical 
sources of tests, on the other hand, include the 
systematic or unsystematic sampling—usually the 
latter—of the areas of personality that are con- 
sidered by the researcher to be relevant to the cri- 
terion behavior. The empirical outlook of the 
student who is developing personality assessment 
techniques is seldom so naive that it is entirely 
uninformed by theoretical considerations, so that 
the “naive empirical” approach in practice tends 
to become mediated by intervening structures and 
thus to approach the analytic strategy described 
below. The intervening structures, however, are 
not made explicit in this empirical approach. 

2. Global. This is the second nonmediated 
strategy, in which the assessor relies on his intui- 
tion, empathy, and verständnis processes to pro- 
vide the predictions, rather than using statistically 
established associations between assessment be-
havior and criteria. If any analysis is made of the
criterion in a global strategy it is directed at the
social role expectations for the criterion perform- 
ance rather than at the required personal qualities 
for successful performance. (The latter is more 
appropriate to the analytic strategy discussed be- 
low.) Information may be given to the assessors 
about the subject’s performance on objective tests, 
and even concerning the validity of these tests— 
for example in the Menninger studies (15)—but
the ultimate assessment is a global one. This 
procedure is the personality assessor’s answer to 
some of the drawbacks of the empirical strategy. 
Intuitive predictions can be used when the as- 
sessors have only a vague concept of the criterion 
conditions, but empirical methods require clear- 
cut criteria and expendable samples of trial sub- 
jects who have been rated on these criteria. Where 
this is impossible, as in the case of the OSS 
studies, the empirical strategy cannot be used, 
and intuition must be resorted to. 

The distinction between empirical and global 
strategies also is analogous to the distinction be- 
tween narrow and wide-band techniques (6), the 
former enabling comparatively more reliable but 
limited predictions. Supporters of the global 
strategy have claimed for it special adaptability 
to the vagaries of the conditions associated with 
both the assessment and criterion situations. Some 
writers also claim for it a special virtue in con- 
nection with personality research in that it avoids 
the violation of a “whole” person inherent in 
trait psychology; however, it is very doubtful 
whether it is correct to use the word “research” 
to describe a mode of study which, if it were ap- 
plied in its pure form of global verständnis, would 
by definition preclude communication of the as-
sessments.

The value of the claims of the global strategists 
to have improved on empirical validation as a 
basis for selection programs is limited. Subjec- 
tive methods of making predictions have seldom 
been shown to be superior to objective methods 
where these are available, excepting in the case 
of especially competent assessors (see below). 
The relative competence of the assessors in mak- 
ing predictions about the subjects is analogous to 
the relative validity of the tests, and both can be 
established by the same type of validation tech- 
niques. In this way the empirical and the global 
strategies are similar in orientation: the proviso 
that incompetent assessors should either be elim-
inated from the assessment panel or trained to
eliminate errors is analogous to the dropping or
amending of an invalid test in the empirical 
strategy. 
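The analogy between validating assessors and validating tests can be shown in a brief sketch. It assumes that each assessor has rated the same subjects on a single scale, that an assessor's validity is the correlation of his ratings with the criterion, and that the panel's judgment is the simple mean of the retained assessors' ratings. The assessor labels, the cut-off, and the function names are hypothetical.

```python
import math


def pearson(x, y):
    """Product-moment correlation; returns 0.0 if either list is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def assessor_validities(ratings_by_assessor, criterion):
    """Treat each assessor's ratings as scores on a test and validate them."""
    return {name: pearson(ratings, criterion)
            for name, ratings in ratings_by_assessor.items()}


def retain_assessors(validities, min_validity):
    """Keep assessors at or above the cut-off, as one keeps valid tests."""
    return [name for name, r in validities.items() if r >= min_validity]


def pooled_ratings(ratings_by_assessor, names):
    """Average the ratings of the named assessors, subject by subject."""
    if not names:
        raise ValueError("at least one assessor is needed for pooling")
    n_subjects = len(ratings_by_assessor[names[0]])
    return [sum(ratings_by_assessor[a][i] for a in names) / len(names)
            for i in range(n_subjects)]
```

As with tests, the validities and the cut-off should be established on one group of subjects and the pooled judgment evaluated on another.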

The “nonanalytic” techniques used in the global 
strategy are not necessarily nonmediated by per- 
sonality constructs, even though these constructs 
may not be made explicit. The process of moving 
from observations of behavior to inferences about 
future behavior uses a set of postulates about per- 
sonality and various derived premises; these prem- 
ises involve certain personality constructs or cate- 
gories into which the assessor places the behavior 
of the subjects, i.e., he “instantiates” the behavior. 
Intuitive inferences, even empathic ones, can be 
reduced to this formulation which provides a 
bridge between analytic and nonanalytic processes. 
This point is elaborated by Sarbin, Taft, and 
Bailey (36). 

3. Analytic. The analytic strategy makes ex- 
plicit the role of mediating constructs in predic- 
tion. A two-stage inference is involved; first, 
there is an inference from the criterion require- 
ment to the traits that are relevant to that per- 
formance (the “criterion analysis”); and, sec- 
ondly, an inference from the subject's observed 
behavior and test performance to his status on the 
trait dimensions (the assessment). Research on 
the validity of these inferences requires two sepa- 
rate studies: one of the validity of the analysis 
of the criterion requirements and the criterion 
indices, and one of the validity of the tests as 
predictors of the criterion. These validation stud- 
ies should be based on independent samples of 
behavior and, for preference, on independent sam- 
ples of subjects, the research on the criterion anal- 
ysis to precede the validation of the instruments. 
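The two-stage inference can be represented schematically as follows. The sketch assumes, purely for illustration, that the criterion analysis yields a numerical weight for each relevant trait and that the prediction is a weighted average of the candidate's trait ratings; the source does not prescribe any particular rule of combination, and the trait names and values used in the usage lines are hypothetical.

```python
def predict_criterion(trait_requirements, trait_ratings):
    """Combine the two inferences of the analytic strategy.

    trait_requirements: result of the criterion analysis (trait -> weight).
    trait_ratings:      result of the assessment (trait -> rating of the
                        candidate); traits not required are ignored.
    Returns the weighted average rating on the required traits.
    """
    missing = set(trait_requirements) - set(trait_ratings)
    if missing:
        raise KeyError("no rating supplied for: " + ", ".join(sorted(missing)))
    total_weight = sum(abs(w) for w in trait_requirements.values())
    if total_weight == 0:
        raise ValueError("the criterion analysis assigns no weight to any trait")
    weighted = sum(w * trait_ratings[t] for t, w in trait_requirements.items())
    return weighted / total_weight


if __name__ == "__main__":
    # Hypothetical criterion analysis and hypothetical ratings of one candidate.
    requirements = {"social_skill": 2.0, "emotional_stability": 1.0}
    ratings = {"social_skill": 4, "emotional_stability": 1, "dominance": 5}
    print(predict_criterion(requirements, ratings))
```

Keeping the two inputs separate makes it possible to trace a poor prediction either to the weights or to the ratings, which is the reason for validating the two stages in separate studies.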

The importance of criterion analysis was recog- 
nized in each one of the “milestone” assessments, 
but the validity of the analysis is usually assumed. 
Two types of approach to the criterion analysis 
problem have been used: intuitive and empirical. 
The intuitive approach is the one usually used in 
personality assessment; typically the assessors have 
used either the testimony of “experts” or their 
own theoretical analysis to determine the cri- 
terion requirements. These analyses rest on a 
theory of personality, but the theory is usually not 
made explicit, nor is it subjected to empirical 
validation. 

The empirical approach to criterion analysis 
can employ subjective or objective methods. The 
Menninger studies, for example, employed sub-
jective rating methods to compare the character-
istics of successful and unsuccessful psychiatrists.4
The VA assessment program was, among other 
things, one big empirical criterion analysis using 
both subjective and objective methods. The study 
began with no explicit analysis of the require- 
ments in clinical psychology and ended with an 
explicit description of some of the characteristics 
which relate to success in various aspects of that 
profession. In a sense, all preliminary validation 
try-outs of tests in a naive empirical strategy, such 
as those used in the VA and IPAR programs, con- 
stitute a criterion analysis. The cross-validation 
that follows may thus be regarded as testing a 
series of hypotheses about the criterion behavior. 
Referring once more to Cronbach and Meehl’s 
contribution (5), we see now that the analytic 
strategy is a type of construct validation which 
attempts to augment the “nomological net” sur- 
rounding the relevant constructs. 

The main difficulty with the analytic method of 
assessment is that it requires a set of constructs 
which may not exist in our present state of psy- 
chological knowledge—although the assessment 
results may contribute to the development of such 
constructs. The difficulties which factor analysts 
often encounter in their attempts to label their 
factors lead one to sympathize with Cattell’s
preference for using reference letters and numbers 
rather than trying to find meaningful labels for his 
personality factors (3). The analytic method, 
then, is limited by the current state of develop- 
ment of personality theory. A further drawback 
of a thoroughgoing analytic method of assess- 
ment is the practical consideration of economy of 
effort; the returns may be just as great, probably 
greater, in the first pilot assessments in a pro- 
gram, if we use an empirical or global strategy 
without trying to make explicit the underlying 
theoretical relationships. In addition, analytic 
assessments require a double inference and con- 
sequently the possibilities of error are increased; 
either the criterion analysis or the ratings of the 
candidate might be in error. In the analytic
strategy, however, there is at least the hope that 
the sources of these errors will be discovered and 
corrected, whereas the sources are masked in the 
nonmediated strategies. 

4 Knowledge of the results of this analysis did not 
improve the validity of the assessors' predictions, but
this could have been caused by the assessors prefer- 
ring to use a global rather than analytic strategy 
despite the analytic information which was supplied 
(15). 


The analytic strategy is applicable to any of the 
three purposes: selection, validation research, or
personality research, but its greatest potential is 
in the latter; in fact, if the results of assessment 
are to be of any value in increasing our under- 
standing of personality, it is essential that the data 
be expressed in terms of basic personality con- 
structs underlying the subject’s behavior so that 
the scores and observations on the subjects may 
become meaningful. This applies both to naive 
empirical strategies such as factor analysis or blind 
item validation, and to global strategies in which 
the mediating constructs are not made explicit. 

To sum up: we have argued that both the naive 
empirical and the global strategies are actually 
mediated by analytic personality constructs, but 
that it is not always necessary, or even possible,
to make those mediating variables explicit. This
may apply whether the purpose of the assess-
ment is validation of the techniques or the carry-
ing out of an actual selection. But when the pur-
pose is personality research, some explicit handling
of the constructs is advisable. The concept of
construct validation supports this requirement by 
merging the validation and the personality re-
search orientations.

Each of the three strategies has its particular 
uses in assessment programs. Where mass screen- 
ing is required, the empirical strategy is usually 
best, if possible; where the criterion situation is 
complex and unrepeatable, but familiar to the 
assessors, the global approach is to be preferred, 
and where the relevant personality theory has at- 
tained a sufficient level of development, the ana- 
lytic strategy is indicated. Where none of the 
basic requirements are present—a repeatable and 
reliable criterion, familiarity of the criterion to 
the assessors, or appropriate personality theory— 
the assessors have to choose the strategy that 
seems best, although no strategy can really redeem 
such a hopeless situation. In general, personality 
assessors being what they are, they will prefer a 
largely intuitive approach, either analytic or glo- 
bal, as they did in the WOSB and OSS situations, 
but an increasing respect seems to be paid to the 
need for illuminating these intuitive methods by 
empirical analysis wherever possible. 


SOME SPECIFIC ISSUES IN ASSESSMENT 
AS A METHOD OF PREDICTING 


Clinical Versus Statistical Approaches.—We 
have argued that there are occasions when intui-
tive, i.e., “clinical,” methods of making predictions
have their appropriate place. Statistical methods
cannot be used where no prediction formula exists. 
But some personality assessors speak as if the 
clinical method is always to be preferred as it 
enables the assessor to be flexible in his use of the 
data in a way that is not possible with statistical 
techniques; for example, the clinician can give 
weight to obvious but rare and nonrepeatable fac- 
tors in the subject’s current situation which could 
not be validated empirically. Other advantages 
claimed for the clinical against the statistical ap- 
proach are that it does not violate the essential 
unity of the subject's personality, and that it en- 
ables the use of empathy and recipathy in mak- 
ing the predictions. (Actually these subjective 
clues could also be used as data by the statistician 
along with other more objective data.) 

Other assessors regard clinical techniques as 
only a last resort. A number of advantages can 
be quoted for statistical prediction over clinical, 
most of which boil down to the fact that the 
statistician has a far more efficient memory and 
a larger attention span than the clinician; he can 
“remember” the relevant data at the appropriate 
time and combine them with other data in order 
to obtain optimal weightings for future predic- 
tions. 

And so we have, on the one hand, the efficient 
but rigid and inhuman statistical prediction, and 
on the other, the flexible and humane but ineffi- 
cient clinical. Which one is more useful in per- 
sonality assessment? There are several discus- 
sions of this question available (e.g., 4, 29, 15, 
36, 27) so the points will not be elaborated fully 
here excepting in so far as they directly affect 
multiple personality assessment procedures. 

The weight of the evidence clearly supports the 
accuracy of the statistical approach compared with 
the clinical. Meehl’s notorious scoreboard (29) 
recording the relative validity of clinical versus 
statistical prediction mounts grim evidence in
favor of the latter. Holt (15) criticizes Meehl’s 
summary on the grounds that most of the studies 
pitted sophisticated, actuarial predictions against 
“naive clinical,” while others (e.g., Wittman’s) 
actually showed the superiority of “sophisticated
clinical” over naive clinical methods. But Meehl 
is quite clear about the rules of his contest: the 
rival methods start off with approximately the 
same objective and subjective data, although in 
some of the studies the clinician used additional 


subjective data. The important difference is that 
the reported statistical predictions were based on 
the naive empirical method of validation, while 
the clinical were either global or intuitively 
analytic. The statistical approaches were not 
concerned with the meaning of the correlations 
between the data and the criteria, although the 
use of cross-validation and statistical refinement 
meant that the empirical procedures were not as 
naive as it appeared, nor were they always un- 
informed by intervening personality constructs. 

Holt pleads for the use of “sophisticated clini- 
cal methods,” by which he means something simi- 
lar to our analytic procedure, using intuition to 
make the final predictions. Among other things, 
he wants the clinicians to make preliminary stud- 
ies of the criterion behavior, in order to analyze 
the requirements for success. Holt does not take 
the step of requiring validation of the individual 
clinicians; but this is necessary to match fully the 
two sides in the contest. He reports that the best 
judges, using global clinical techniques, reached 
prediction validities of up to 0.57, whereas statis- 
tical treatment of the tests—regular Rorschach 
scoring (validated and cross-validated) and the 
Strong Interest Psychiatrist key—resulted in vir- 
tually zero validities. But Holt’s contest is unfair 
to the statistical side. His experiment was a half-
hearted affair; no attempt was made to develop
objective tests that would be appropriate to the 
selection problem at hand, as was done in the VA 
and the IPAR studies, and on Holt’s own admis- 
sion the Strong key was validated a long time 
previously in an entirely different situation. Holt’s 
report, as he himself indicates, does not provide 
us with a fair contest between sophisticated clini-
cal and sophisticated statistical approaches.

In recommending a sophisticated clinical ap- 
proach, Holt argues that “there simply is no sub- 
stitute for empirical study of the actual associa- 
tion between a type of predictive data and the 
criterion” (15, p. 3). Despite this, the evidence 
that he presents on the value of objective criterion 
analysis for the assessor (i.e., validation) is not 
promising. The assessors at Menninger were pro- 
vided with “manuals” embodying validation mate- 
rial on the interview, TAT, Rorschach, and other 
assessment techniques that had been used in an 
earlier assessment of psychiatrists at Menninger.
Holt’s conclusion about their value reads as fol-
lows: “Of the six, two proved worthless . . . ;
the other four all showed more or less promise,
but there was none that yielded consistently sig-
nificant validities regardless of who used it” (15,
p. 8, italics ours). Evidently, the assessors would 
not, or could not, use the validation data which 
were provided for them. 

We are thus reminded once more that valida- 
tion includes validation of the specific assessors 
carrying out the specific assessment task. Nearly 
all reports of personality assessments offer evi- 
dence that the assessors differ considerably in their 
predictive skill. These differences are made up of 
two types of variation: variation due to differences 
in general ability to judge people (40), and inter- 
action effects between the assessor and the type of 
judgment called for (7). The reports on assess- 
ments offer the hint that the highest validities are 
achieved by assessors who have the most familiar- 
ity with the criterion situations and with the type 
of person who is successful in those situations; 
for example, in the CISSB assessments, the Board 
of Review consisting of experienced civil service 
administrators made more accurate predictions 
than did the original CISSB selection committee. 
In the former, the most valid predictions were 
made by the chairmen who were also civil service 
administrators. 

Accurate assessments are most likely to occur 
where the assessor uses the in-group stereotypes 
which are also held by the criterion raters; they 
are able to “play their predictions by ear” without 
any need to make the double inference involved 
in analytic techniques. In support of this method 
of predicting we can quote the comparatively 
high validities found for ratings of the “likeable- 
ness” of the candidates in the Michigan, Mennin- 
ger, and IPAR assessments. For example, in the 
latter, the assessors were mainly university profes- 
sors, and it is therefore not surprising that their 
ratings of “personal soundness” correlated as high 
as 0.52 with ratings made of the candidates on 
this quality by their own departmental professors 
(2). All other things being equal, the best as- 
sessors for predicting existing criteria are those 
who are partially contaminated with the same 
experience, standards, and outlook as the crite- 
rion raters and can thus rely on a global strategy 
to make their predictions. (The most accurate 
assessors are also more accurate than the most 
accurate, cross-validated, tests.) 

The validity of analytic methods is subject to 
the accuracy of the personality theory which the 
assessor uses, but psychologists usually possess 


fairly stable postulates, based on the lore of their 
discipline rather than behavior-oriented empirical 
research, and these are not readily changed in the 
light of actual empirical data. This is probably 
the reason why some of the Menninger assessors 
did not improve their accuracy with the help of 
the empirically-based “manual.” The difficulty 
can be seen clearly if we consider the findings of 
the Minnesota starvation studies (22) that the 
Rorschach indices of adjustment had a negative 
validity in predicting ratings of the subject’s ad- 
justment after starvation. Could a typical clinical 
psychologist bring himself to reverse completely 
his normal interpretation of the Rorschach in or- 
der to predict the subject’s adjustment under the 
criterion conditions? Not unless he were able 
to find an intervening variable between the Ror- 
schach and the criterion that would enable him 
to understand the connection within the frame- 
work of his existing theory of personality. 

Our discussion of clinical versus statistical meth- 
ods of assessment has concentrated on one aspect 
of the procedure, the prediction-making stage. 
The contrast between these two approaches can 
be made in connection with a whole chain of de- 
cisions that must be made in the course of assess- 
ment: these decisions include determining the ac- 
ceptable criteria, scoring the criterion behavior, 
conducting the criterion analysis, determining the 
form of the tests and standard situations, observ- 
ing and classifying the assessment behavior (i.e., 
scoring), combining the observations made by 
any single assessor into an assessment or predic- 
tion and combining the predictions made by dif- 
ferent assessors. For example, should the indi- 
vidual assessments be combined subjectively by 
the chairman of an assessment board, by voting, 
or by averaging the individual predictions? In- 
sufficient attention has been given to the relative 
merits of subjective and objective methods at each 
one of these stages. 

The choice of method will depend on both the 
requirements and the over-all situation, includ- 
ing, sometimes, public relations considerations. 
The final selection of assessment techniques is 
likely to be a mixture of both subjective and ob- 
jective, but the circumstances that will favor one 
or the other at any stage are rather vague, and the 
choice is usually made on subjective grounds, al- 
though it, too, could be made on the basis of 
objective, empirical investigation. In general, ob- 
jective methods are to be preferred as far as pos- 




sible as they maximize accuracy, but practical 
considerations of economy, convenience, and the 
limitations of the situation, dictate the wholesale 
use of subjective methods in personality assess- 
ment. These subjective methods may have high 
validity under favorable circumstances, and where 
the assessors are familiar with the criterion situa- 
tion, clinical judgments may actually be more ac- 
curate than any objective methods are ever likely 
to be in predicting to criteria. 

Conditional Variables in the Criterion.—An old 
problem in evaluating the validity of prediction is 
set by variations in the criterion situation attribut- 
able to the surrounding conditions. For example, 
a prediction that a candidate will make a good 
officer may be invalidated through some contin- 
gency such as being posted to a commanding offi- 
cer with whom he is incompatible. But these 
conditional factors do not stand on their own; 
there is an interaction between the person and the 
condition. Thus, Officer A may have the type of 
personality (or background) that makes it likely 
that he will be posted to a commanding officer 
with whom he will be incompatible; if, for in- 
stance, Candidate A is Jewish, he is more likely 
to have a CO who behaves uncongenially than is 
another candidate of a similar personality who is 
not Jewish. Further, Officer A may perform his 
duties better than otherwise when he has an un- 
congenial CO, while Officer B may perform his 
duties worse under the same circumstances. In 
most assessments, no specific reference is made 
to such conditional factors and there is an implicit 
assumption of “given normal conditions” attached 
to the predictions. The OSS reports a validity of 
only 0.19 for all cases from Station S compared 
with 0.39 for only the cases who were given as- 
signments that were consistent with the ones for 
which they were assessed. 

A further condition that is often ignored in as- 
sessment is that of effluxion of time; the predic- 
tions are usually made on the assumption that the 
status of the candidate on the relevant variables 
will remain constant over time. At a more sophis- 
ticated level, trends towards change may be ob- 
served in the candidate together with potential 
but as yet unrealized capacities, and the assess- 
ment may extrapolate these into the future. But 
it is virtually impossible to take into account sub- 
sequent learning, maturation and deterioration in 
the assessment prediction. 

In this connection, Cronbach and Gleser (6) 


have proposed a useful distinction between fixed 
treatment (the same conditions for all successful 
candidates) and adaptive treatments varying ac- 
cording to the candidate. Evidently the treat- 
ment of the OSS selectees was fixed rather than 
adaptive, and the predictions should have taken 
this into account. Five different types of solu- 
tions are suggested below for the problem of con- 
ditional factors in these treatments, Solutions 1 
and 3 being particularly appropriate to fixed treat- 
ments, and the other three to adaptive. (These 
represent an expansion of the three solutions pro-
posed in Horst, 17, ch. 5.)

1. Adjust the criterion ratings ex post facto ac- 
cording to the ease or difficulty presented to the 
candidate by the criterion conditions and the ef- 
fects of these conditions on him over the relevant 
period of time. This adjustment requires an in- 
tuitive judgment that takes into account the inter- 
action between the conditions and the candidate, 
and this can be done only by the rater making a 
further, independent assessment of the candidate. 
For the validation to carry conviction, it is neces- 
sary that the adjustment to the criterion rating be 
made independently of the assessment. 

2. Make the predictions to the ideal possible 
conditions so that they represent the candidate’s 
fullest potential; the criterion ratings can then be 
made in accordance with the same standards. In 
other words, both assessment and prediction at- 
tempt to hold conditions constant in the form 
that is considered to be optimal for the candidate’s 
performance. The actual conditions applying at 
the time of assessment and during the criterion 
performance are unlikely to be optimal no mat- 
ter how hard this state is sought, so that the use 
of this solution rests very heavily on intuition. 

3. Predict to the average or modal conditions 
that have prevailed in the past with respect to the 
criterion situation, or which are expected to pre- 
vail in the future. This is the usual orientation 
in assessments based on empirical validation since 
the correlations on which the validities are based 
are in effect averages. The empirical strategy 
automatically takes into account the variations in 
conditions as well as their average effect and 
maximizes the prediction to these average condi- 
tions. The conditions, thus, influence the assess- 
ment only through their effects on the criterion 
performance, without regard to their specific na- 
ture. It is practically impossible for a clinician 
to average all possible relevant conditions by an 
intuitive act, although it is common for a clinician
to bear in mind the modal conditions which can- 
didates face when these are prominent. 

4. The future conditions may be predicted spe- 
cifically for each candidate so that the interaction 
between the candidate and the criterion conditions 
may be anticipated in the assessment. The pre- 
diction to the future conditions may be made on 
the basis of inside knowledge of the treatment 
to be given to the candidates in the criterion situ- 
ation, or by forecasting on general grounds the 
specific changes that will occur in the conditions 
before the criterion ratings are made. Such pre- 
dictions must be intuitive rather than empirical, 
and, by the nature of the complexity of man’s en- 
vironment, all such intuitive predictions must fall 
well short of perfect validity. In some complex 
situations, in which the criterion performance is 
highly dependent on the conditions, the inability 
of the assessors to predict the specific conditions 
that will operate for any particular candidate may 
render the assessments completely invalid. 

5. The predictions themselves can be made in 
terms of specific conditions: “if X conditions oc- 
cur, then the candidate will be successful.” In 
this endeavor, the recent proposal by Cattell (3, 
pp. 426ff.) for a taxonomy of situations might 
eventually supply a list of standard situations to 
be considered in conditional predictions. 

Solutions 4 and 5 are both specific conditional 
solutions which can take into account the effects
of conditions that are external to the candidate, 
as well as intrinsic conditions such as maturation. 
They require both a knowledge of the criteria re- 
quirements and a correct assessment of the can- 
didate, but the first type of conditional predic- 
tion emphasizes the criterion situation, and the 
second, the candidate. Both of these latter meth- 
ods of meeting the problem of conditions are 
adaptable to taking into account multiple condi- 
tions and also “adaptive treatments” such as pro- 
visions for training that are tailor-made for the 
candidate. They hold out the possibility of mak- 
ing more exact predictions than can be made by 
the other three attempted solutions to the prob- 
lem. This is one of the reasons why the global 
strategy, or slightly analytic versions of it, have 
been so often favored in selection assessment pro- 
grams. But these conditional predictions are also 
the most difficult to make, and only the best 
judges of personality or the ones who are most 
experienced with the criteria conditions are able 


to make them accurately, and then only in appro- 
priate situations. 

The decision as to the appropriate solution to 
the problem of varying conditions is closely re- 
lated to the choice of strategy. In the long run 
the choice is one between elegance and the prac- 
tical limitations that are imposed on the possi- 
bilities of accuracy. 

The Assumption of Safety in Numbers.—Per- 
sonality assessment programs rely on numbers to 
improve their validity in two directions: multiple 
tests and multiple assessments. We shall treat the
evidence concerning these two points separately. 

Multiple tests—Where the tests and other as- 
sessment measures are combined objectively, for 
example, in accordance with a multiple regression 
equation, even the most valid test can usually be 
improved upon by adding one or two further 
measures to it. It is often striking, however, how 
quickly the multiple Rs reach their ceiling; the 
common components of almost all available per- 
sonality measures seem to be so high that we 
quickly exhaust the new elements that additional 
tests can bring to the predictions. The same ap- 
plies when the combining of elements is carried 
out intuitively, even though the clinician may be- 
lieve that the pieces of information about a can- 
didate are independent of each other. It is doubt- 
ful whether a clinician can use more than a few 
pieces of data that are relatively independent, 
even if they can be found. Sarbin (35), for in- 
stance, demonstrated that clinicians who were 
given a mass of data from which to predict the 
success of university students, gave most of the 
weight to two variables only. 
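
The ceiling on the multiple R can be illustrated with a small calculation. The sketch below is an illustration of our own, not a result from the studies cited. It assumes the idealized case in which every measure has the same validity and every pair of measures has the same intercorrelation, so that equal weights are optimal and the multiple R equals the validity of the simple sum. The function names and the figures 0.30 and 0.50 are hypothetical.

```python
import math


def composite_validity(validity: float, intercorrelation: float, k: int) -> float:
    """Correlation between the criterion and the sum of k standardized measures.

    Assumes (as an idealization) that every measure correlates `validity`
    with the criterion and `intercorrelation` with every other measure.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Covariance of the sum with the criterion is k * validity;
    # variance of the sum is k + k * (k - 1) * intercorrelation.
    return validity * math.sqrt(k) / math.sqrt(1 + (k - 1) * intercorrelation)


def ceiling_validity(validity: float, intercorrelation: float) -> float:
    """Limit of composite_validity as the number of measures grows without bound."""
    return validity / math.sqrt(intercorrelation)


if __name__ == "__main__":
    # Hypothetical figures: each measure has validity 0.30,
    # and any two measures correlate 0.50 with each other.
    for k in (1, 2, 3, 5, 10, 50):
        print(k, round(composite_validity(0.30, 0.50, k), 3))
    print("ceiling", round(ceiling_validity(0.30, 0.50), 3))
```

Under these assumed figures the gain from adding the second measure is larger than the combined gain from adding the eleventh through the fiftieth, and no number of further measures can raise the validity above the ceiling.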

Evidently, to give a clinician more than two or 
three pieces of data about an assessee is likely 
to be of little value. Some critics go even fur- 
ther, claiming that giving extra data actually re- 
duces validities by confusing the allocation of sub- 
jective weights to the predictor variables, and by 
increasing the variability of the predictions, i.e., 
inducing the clinician to venture into making ex- 
treme judgments which increase the risk of mak- 
ing large errors. Kelly and Fiske claim (20) that 
in the Michigan study validities declined as more 
data were given to the assessors. Holt challenges 
the accuracy of their interpretation of the findings 
(15, p. 8), but even so there are other studies that 
suggest that more data do not always improve 
accuracy (e.g., 10, 11, 25, 38). In Giedt’s study, 
for instance, the clinicians were able to make more 
valid predictions of mental patients’ personalities
from sound recordings than from sound movies. 
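
The critics' point about extreme judgments can be made concrete. In the following sketch, which rests on assumptions of our own, the assessor's impression x and the criterion y are both standardized and correlate r. The assessor predicts y as b times x, where b expresses how extreme he allows his predictions to be. The expected squared error is then 1 - 2br + b^2, which is smallest at b = r. The function name and the validity of 0.30 are hypothetical.

```python
def expected_squared_error(slope: float, validity: float) -> float:
    """Expected squared error of predicting y as slope * x.

    Assumes x and y are standardized (mean 0, variance 1) and that
    they correlate `validity` with each other.
    """
    # E[(y - b*x)^2] = Var(y) - 2*b*Cov(x, y) + b^2 * Var(x)
    return 1 - 2 * slope * validity + slope ** 2


if __name__ == "__main__":
    r = 0.30  # hypothetical validity of the assessor's impressions
    for b in (0.0, r, 0.6, 1.0):
        print(f"slope {b:.2f}: expected squared error "
              f"{expected_squared_error(b, r):.2f}")
```

With a validity of 0.30, an assessor who predicts as boldly as if his impressions were perfectly valid (b = 1) does worse on this measure than one who predicts the average for every candidate (b = 0).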

But there are several studies affirming that, at 
least under some circumstances, more data do 
enable clinicians to improve their accuracy. We 
have already referred to Vernon’s report (44) that 
in the CISSB selections the Board of Review was 
able to improve on the assessment board’s recom- 
mendations by combining these with their own 
interview impressions of the candidate. Increased 
validity with increased data is also reported for 
the California (28), Chicago (39), and Mennin- 
ger (15) assessments. 

We must suspend our verdict on the value of 
multiple data at this stage. Evidently there are 
circumstances that can overcome the limitations 
on the ability of a single assessor to hold in mind 
data and to combine them. One suggestion worth 
testing is to combine data into subdecisions of an 
increasing degree of molarity, until the final molar 
decision is reached. This procedure can assist 
the clinician to consider all of the data in reaching 
his final decision and it is analogous to the use 
of structured schedules and rating forms that are 
used by interviewers to consolidate portions of 
the data as they go along. This technique as a 
general aid to clinical judgments seems worth ex- 
perimenting with, although the danger must be 
avoided of giving too much weight to the data 
that are presented first. In this respect it would 
seem to be wise to seek out first the data that are 
believed to be the most valid. 

Another way of handling the combining of data 
is to use several assessors, each responsible for 
one or two different techniques or areas of person- 
ality. This was the method adopted, for exam- 
ple, in the CISSB assessments. This proposal 
carries over to the general question of using mul- 
tiple assessors and we shall consider it further 
below. 

Multiple assessors.—The practice of using more 
than one assessor in selection work is an old one; 
the assumption has been that the more assessors 
there are, the more ideas will be thrown into the 
pool and therefore the more thorough will be 
the marshalling of data. Where ratings are 
pooled, it is also hoped that errors will cancel 
each other out. Very little experimental mate- 
rial is available on the relative value of group 
versus individual judgments in personality assess-
ments, but evidence can be used from other work


on other types of group performance (see 18; 23, 
ch. 1; 1, ch. 5). 

These findings suggest, among other things, that
accuracy of judgments increases with the size 
of the group, but the optimum number in informal 
problem-solving groups is possibly five, since 
larger groups require formal structuring in order 
to ensure adequate communication of informa- 
tion; that compatible membership is important 
in problem-solving committees; that democratic 
groups produce more different ideas than indi- 
viduals but fewer per person; that the quality of 
group decisions increases with an increase in the 
skill of the members; that groups are quicker at 
solving problems than individuals, although less 
economical in terms of man-minutes. However, 
these findings vary according to the type of task 
concerned, and before we can carry them over to 
personality assessment it is necessary to bear the 
type of task in mind.

Some of the questions that should be asked 
concerning group factors in personality assess- 
ment are: are group ratings more accurate than 
those of the individual members of the group; 
does group discussion by the assessors improve 
accuracy over pooled individual ratings; what is
the relative value of means, modes, and medians 
as methods of pooling; the ideal size of commit- 
tees; committee ratings versus averaging; authori- 
tarian leadership of assessment committees versus 
democratic; should all of the committee members 
be given the same data; should both the observa- 
tions and the interpretation be made by groups? 
These questions can be considered at three points 
in the assessment procedures: (a) in making sub-
jective observations of the subjects; (b) in elicit-
ing data from the subjects; and (c) in integrating
and interpreting the data, and making the deci- 
sion. 
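
The question of means, modes, and medians as methods of pooling can be seen in miniature with a hypothetical board of five assessors, one of whom gives a rating far from the rest. The sketch uses Python's standard statistics module; the ratings and the function name are invented for illustration.

```python
from statistics import mean, median, multimode


def pool_ratings(ratings):
    """Pool one candidate's ratings by three different rules."""
    return {
        "mean": mean(ratings),         # moved by every rating, including outliers
        "median": median(ratings),     # middle rating; ignores how extreme the tails are
        "modes": multimode(ratings),   # most frequent rating(s); may not be unique
    }


if __name__ == "__main__":
    # Hypothetical ratings of one candidate on a 1-to-9 scale;
    # the fifth assessor is far more favorable than the others.
    board = [4, 5, 5, 6, 9]
    print(pool_ratings(board))
```

The mean is pulled toward the dissenting assessor while the median and mode are not. Which behavior is preferable depends on whether the dissenter is more likely to be in error or to have seen something the others missed.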

(a) At the observational level, we should ex- 
pect that the pooled ratings of several observers 
would be more accurate than individual ratings, 
since pooling reduces the error variance, provided 
always that the individual judgments have some 
validity in the first place (cf. 18, p. 739). 
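
The argument in (a) can be checked by simulation. The sketch below assumes, as an idealization of our own, that each observer's rating is the candidate's criterion score multiplied by a common signal weight, plus an independent random error. All names and parameter values are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n_candidates=2000, n_observers=5, signal=1.0, error_sd=2.0, seed=0):
    """Return (individual validities, validity of the pooled mean rating)."""
    rng = random.Random(seed)
    criterion = [rng.gauss(0, 1) for _ in range(n_candidates)]
    # Each observer sees the criterion through his own independent error.
    ratings = [
        [signal * c + rng.gauss(0, error_sd) for c in criterion]
        for _ in range(n_observers)
    ]
    individual = [pearson(r, criterion) for r in ratings]
    # Pool arithmetically: the mean rating given to each candidate.
    pooled = [sum(col) / n_observers for col in zip(*ratings)]
    return individual, pearson(pooled, criterion)


if __name__ == "__main__":
    individual, pooled = simulate()
    print("individual:", [round(v, 2) for v in individual])
    print("pooled:    ", round(pooled, 2))
    # With signal=0 the judgments have no validity, and pooling adds none.
    _, pooled_noise = simulate(signal=0.0)
    print("pooled, no signal:", round(pooled_noise, 2))
```

Setting the signal to zero shows the proviso in the text: averaging cancels independent errors, but it cannot create validity that no individual judgment possesses.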

(b) The value of the group interview versus 
individual interviews as a means of eliciting data 
is equivocal (see the discussion in Oldfield, 32). 
A recent study (12) on the selection of super-
visors found that individual ratings based on group
interviews by a panel of three were no more ac- 
curate than the ratings made by one interviewer 
per candidate. While it is true that the group
situation may elicit a wider sample of behavior 
than an individual interview, it is more difficult for 
an interviewer to evaluate the significance of the 
group as the stimulus to which the candidate is 
responding. However, if the interviewing is con- 
ducted by the chairmen only, while the other as- 
sessors are simply observers, this may enable the 
assessors to make more unbiased judgments than 
when they are actually involved in the interview- 
ing. This effect still remains to be tested empiri- 
cally. 

(c) The usual consideration of the value of 
group assessment versus individual deals with the 
integration of the available data. As in the case 
of group observations, pooled predictions are 
more accurate than most or all of the individual 
predictions (24, 26, 43). In one study (37) of 
assessing the qualities of a child on the basis of 
behavioral data, the accuracy increased with an 
increase up to 50 of the number of assessors 
whose ratings were pooled (there were only 50 
assessors available). 

Does discussion prior to assessment increase 
accuracy? The evidence on this suggests that it 
does not (34, 41, 21, 32). As Oldfield puts it: 
“Discussion of the merits of candidates merely 
amounts to a somewhat clumsy method of aver- 
aging the individual judgments of the members” 
(32, p. 129). Whether discussion aids accuracy 
or not appears to depend on the quality of the 
persons who dominate the discussion either 
through their position in the group, their person- 
ality, or their professional standing. Discussion 
is justified particularly when there is an “expert” 
as chairman, who will actually make the final de- 
cision in an autocratic manner, but who calls on 
the other members of the panel to give him the 
benefit of their opinions. An “expert” is defined, 
for this purpose, as a person who is experienced 
both in assessment and in the criterion situations. 

Kelly and Fiske are quite pessimistic regarding 
the use of multiple assessors. “Until some of the 
major sources of error in predictions are elimi- 
nated, the replications of assessors and the use of 
staff conferences hardly seems justified for this 
type of prediction” (21, p. 178). This conclu- 
sion is too sweeping. As we can see from Table 1, 
both pooled and committee (discussion) ratings 
have justified themselves in some studies. 

Let us conclude this section on the “safety in 


numbers” assumption with a proposal to combine 
the advantages of both multiple techniques and 
multiple assessors. The suggestion is that each 
assessor be given a limited amount of informa- 
tion on which to base his assessment judgments 
about the candidates, each assessor to receive dif- 
ferent information. The assessments will then 
be pooled arithmetically. The information sup- 
plied may be objective or subjective, atomistic or 
molar, and may range from one item of life- 
history, or a test result, to a projective test proto- 
col, an interview or the observation of behavior in 
a miniature situation. This procedure would en- 
able a vast amount of data to be integrated with- 
out problems of weighting since unit weights 
for each assessor’s contribution would be ade- 
quate—this would be analogous to an inventory 
that gives unit weight to each item. With ade- 
quate organization of the assessment program, 
this would permit several assessors to contribute 
to the final assessment so that different viewpoints 
and personality theories can be represented. This 
approach seems to be at least worth experiment- 
ing with. 

Even if it is found that increased numbers of
assessors increase perceptibly the accuracy of
the assessments, there is still a fine calculus of 
cost in human time and effort to be computed. 
The decision to augment the panel with additional 
assessors is a function, among other things, of 
the gradient of diminishing returns, the ability of 
available extra assessors, the cost of using them, 
the effects on the candidates, and the desire to 
allow executives in the institution to participate 
in the assessment. The proposal made above of 
having many assessors, who contribute small 
pieces of information, may make it possible to 
conduct multiple assessments comparatively 
cheaply. 


SUMMARY 


Multiple personality assessment procedures 
have been analyzed with respect to their primary 
purpose and the validation strategy used. Prob- 
lems that arise in the attempt to use personality 
assessment for selection were discussed with re- 
spect to the problem of clinical versus statistical 
predictions, the problem of conditional factors 
that affect the criteria, and the value of using 
multiple tests and more than one assessor. 

Some recommendations:

1. Use objective techniques as far as possible 
for analyzing the criterion, for scoring tests, and 
for making predictions. 

2. Give careful consideration to requirements 
of the criterion and make empirical studies of 
the link between these requirements and both the 
test behavior and the criterion behavior. This is 
a step in construct validation. 

3. As a preliminary to the above type of cri- 
terion analysis, nonmediate empirical or clinical 
methods of prediction may be used. 

4. Subjective observations should be made by 
several observers whose opinions should be 
pooled arithmetically. 

5. The assessors should be selected for proven 
ability to make accurate judgments in the assess-
ment situation, i.e., they should be validated.

6. The assessors should be familiar with the 
criterion situation, and should take this situation 
into account when they make the predictions. 

7. Each assessor should be given no more than 
two or three units of information; there should 
be a large number of assessors whose predictions 
are pooled arithmetically, and without discussion. 

8. In selection assessments, if committee de- 
cisions are desired, the assessors who are par- 
ticularly well experienced in the criterion situa- 
tion should be given special influence in forming 
the final decisions, provided they have been 
shown to possess good ability to judge persons. 


FOUNDATIONS OF PERSONALITY
MEASUREMENT THEORY IN 
MULTIVARIATE EXPERIMENT * 


RAYMOND B. CATTELL 


Personality assessment has inspiration in many 
distinct fields, including such specialty areas as 
clinical, educational, counseling, and experimen- 
tal psychology. No matter where objective test- 
ing arises, it is a pointless procedure to make 
measurements and scales that are unrelated to 
meaningful personality structures. Consequently 
personality assessment and basic theoretical re- 
search on personality become one and the same 
enterprise. 


STRATEGY OF PERSONALITY RESEARCH 


A brief digression on the strategy of personal- 
ity research is essential here to justify my later 
inferences. Since we nowadays rightly empha- 
size explicitness of methodology and models, may 
I remind you that though the hunches on per- 
sonality structure come from many fields of ob- 
servation, the basic scientific methods in psy- 
chology by which such hypotheses are tested, re- 
vised, and recreated, are really of two kinds only; 
namely, the univariate, controlled experiment, as 
borrowed from classical physics, and the multi- 
variate experiment, as illustrated in factor anal- 
ysis, the multiple discriminant function, canoni- 
cal correlations, etc. In the first approach, we 
try to hold constant everything but that which 
we are interested in manipulating and we then 
observe how one measure—the dependent varia- 
ble—changes with the changes we produce in the 
independent variables. In the multivariate ap- 
proach, on the other hand, we enter the experi- 
ment with a great number of variables, usually 
allowing them to vary as they vary in nature, 
without attempting to control them artificially in 
any way. We then tease out the relationships 
among them by the superior statistical potency 
of the methods which have been developed, prin- 
cipally in the life sciences, since the days of classi- 
cal physics. 

* Reprinted with permission from D. Van Nostrand 
Company, Inc. In Bernard M. Bass & Irwin A. Berg 
(Eds.), Objective Approaches to Personality Assess- 
ment, 1959, 42-65. 


COMPARISON OF METHODS 


There are advantages and disadvantages to 
both methods, though you would sometimes think 
from the pious expressions of brass instrument 
psychologists that all scientific purity lies with 
the classical univariate method. Actually, the 
multivariate method can claim three great scien- 
tific advantages. First, it can deal with patterns 
and wholistic concepts. A clinician in my pres- 
ence once remarked to a psychologist that he 
proposed to do some experiments on the relation 
of the superego to school achievement. Where- 
upon the classical experimentalist, a man as direct 
as he was eminent, snorted, “What is a superego? 
I have never seen one.” 

The implications of this remark should really 
be a “plague to both their houses.” So long as 
the older type of experimenter deals only with 
single variables he must remain blind to any- 
thing that requires demonstration as a complex 
pattern. But equally the clinician, although 
taking many variables into account, is unable ob- 
jectively and scientifically to convince others, e.g., 
to show that the superego is not a myth but a 
visible pattern, unless he commands the powers 
of mathematical analysis to determine and demon- 
strate a loading pattern. The multivariate statis- 
tical methods possess this power, and, as I shall 
hope to point out in my summary, it is possible 
to define the superego, various drives, and a 
number of complex temperament patterns to a 
useful degree of exactitude by factor analytic 
means. Furthermore, when these constructs are 
measured as factors, they can enter into exact 
experiment as readily as any single concrete vari- 
able. 

The second advantage of the multivariate 
method is its sheer business efficiency. If you 
go to the labor of measuring, say, 200 variables, 
in a hundred pairs of two, on a large population, 
you get by classical experiment evidence on 100 
relationships. If, on the other hand, you do the 
same amount of experimental work and use a 
multivariate method of analysis, you throw light 
on the nature of approximately 20,000 relation- 
ships. That is to say, you possess the correla- 
tions in a matrix of 200 variables. Actually, this 
is something more than an enormous—two hundred to 
one—gain in efficiency. 
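
The size of this gain can be checked by counting. Among n variables there are n(n-1)/2 distinct pairs, which is the number of correlations in one triangle of the matrix. The sketch below works this out for the example in the text; the function and variable names are hypothetical.

```python
from math import comb

def pairwise_relationships(n_variables):
    """Number of distinct variable pairs, i.e. the off-diagonal
    correlations in one triangle of an n x n matrix: n(n-1)/2."""
    return comb(n_variables, 2)

n = 200
univariate = n // 2                        # 200 variables studied as 100 separate pairs
multivariate = pairwise_relationships(n)   # every variable correlated with every other
gain = multivariate / univariate           # ratio of relationships examined
```
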

When the 100 relationships of the univariate 
experimental design are taken from many differ- 
ent samples, as commonly happens when an un- 
fortunate reviewer of an area in the Psychological 
Bulletin is trying to make sense out of a hundred 
independent researches, the findings are essen- 
tially incomparable, statistically, because of com- 
ing from different samples, and they are of ques- 
tionable comparability experimentally, because 
they are always tainted by the idiosyncrasies of 
the various investigators and their locations. The 
hypothesis-testing power, and especially the hy- 
pothesis-creating power of the multivariate ex- 
periment is here far greater, because we know 
that all the relations are comparable, having been 
made on the same group. 

Additionally, the factor analytic method has 
special resolving powers, in terms of discerning 
meaningful patterns among the correlations. We 
get what a philosopher might call “emergents” 
from the accumulated, criss-crossing relationships, 
such as can never come from reasoning about 
single relationships or from the blind game of 
partialling this influence out from that, one at a 
time. In personality research it means that we 
are enabled to detect the major structures oper- 
ating across this whole field, whereas when one 
works with variables two or three at a time, try- 
ing to partial out this from that, one is apt to 
run around in the prison circle of one’s own 
feeble vision of possibilities. 

The third advantage of the multivariate method 
does not belong to its general use in psychology, 
but is specific to its application in the field of 
personality and clinical study. It resides in the 
fact that human beings decline to let you do 
controlled manipulative experiments on matters 
of vital emotional importance to them. If you 
wished to study the effect upon marital couples 
of a mother-in-law coming to stay in the home, 
you would not be well advised to go around 
issuing invitations to mothers-in-law, dropping in 
afterwards to see what this very independent 
variable has done to your dependent variables. 

There are two objections to the manipulative 
experimental design in the field of personality. 
The first is that you ought not to do it, and the 
second is that, if you throw ethics aside and 
proceed, the artificial insult of the experiment 
may create a situation quite different from the 
naturally occurring one. When you chop pieces 
off a man’s adrenal glands you do something 
more than reduce his adrenal functioning. The 
multivariate experimenter, like the clinician, al- 
lows life itself to make the experiments, in nat- 
urally functioning organic wholes, and then ex- 
tracts the causal connections by superior statis- 
tical analytical procedures. If you stick to the 
controlled experiment in regard to emotional 
learning, etc., you are compelled like Mowrer, 
Miller and others to move increasingly away from 
human beings to animals. This leaves you with 
the impossible task of generalizing across from 
animal behavior to something vaguely analogous 
in human behavior. In fact, methodologically 
you have allowed the rat to lead you into a worse 
cul-de-sac than in any maze you ever constructed 
for him. 

If now you look at the so-called clinical method 
with this broad dichotomy of method in mind 
you will notice that it has several close parallels 
to the multivariate method. Both deal with 
major emotional events in the lives of human 
beings, allowing life itself to provide the source 
of manipulation, and both work upon wholistic 
perceptions of patterns and relations, rather than 
upon single variables. Indeed, I think it can be 
seen that there is really no such thing as a sepa- 
rate clinical method (unless we are talking about 
a therapeutic method), for, when stripped down 
to its essential formal procedures, the clinical 
method is the multivariate method. Unfortu- 
nately, though it is formally the multivariate 
method, it lacks scientific rigor, proceeding by 
intuition and fallible human memory, instead of 
being carried out on exact measurements by an 
electronic computer, using a far superior memory 
and a fully explicit statistical procedure. In 
terms of progress in the scientific study of per- 
sonality, the clinician has his heart in the right 
place, but perhaps we may say that he remains 
a little fuzzy in the head. The salvation of the 
clinical method lies in filling out its cloudy pro- 
cedures by structural statistics, decidedly more 
complex, incidentally, than those known to uni- 
variate methodology. Factor analysis is only one 
such statistical model, though it is the best we 
have achieved so far. 


MEASUREMENT FOLLOWS STRUCTURE 


But let us now return from this survey of foun- 
dations to my first assertion that measurement 
must follow structure. I am aware that this 
reiteration of “no testing without structure” 
makes me as popular among certain kinds of test 
constructors in educational and clinical psychol- 
ogy as a Baptist minister reminding people of the 
Ten Commandments in an establishment for or- 
ganized vice. But I would repeat that you may 
use the most impressive scaling procedures, refin- 
ing Guttman, Coombs, and others to the nth 
power, and still be merely engaged in a sort of 
psychometric chess game, as far as any psycho- 
logical understanding of psychological problems 
is concerned. If your scale is not guaranteed to 
deal with something psychologically meaningful 
and organic, it cannot help in psychological pro- 
cedures. And, incidentally, it does not seem to 
be sufficiently realized that a Guttman scale, or 
any other scaling method per se, does not guar- 
antee a factor pure scale. A correctly scaled 
scale may still be of any degree of factorial con- 
fusion. 

When I mention a demonstrable functional 
unity in what follows, I refer technically not 
only to a pattern of covarying parts which can 
be demonstrated as a unique, replicable, deter- 
minate factor in terms of factor analysis, but also 
to a pattern which additionally could be shown 
to function as a whole by univariate, controlled 
experiment. That is to say, the pattern should 
show itself not only by a person who is higher 
in one element of it consistently being higher in 
the other elements, but also by the parts varying 
together from occasion to occasion when an ex- 
perimental influence which changes this trait is 
brought to bear. 

Within multivariate methods, this means that 
the factor pattern out of which the construct or 
concept arises, must be demonstrated not only by 
the classical R-technique, but also by the longi- 
tudinal P-technique. It may also be demon- 
strated by other factor analytic experimental de- 
signs, such as the condition-response design, in 
which one simultaneously factors in a single 
matrix both the various stimuli that might cause 
the pattern to change in level, and all the mani- 
festations by which the pattern is recognized. 
In short, to ensure that a unitary trait is sound 
in wind and limb, it should be thumped in many 
different parts. Thus, in Scheier’s work on the 
nature of anxiety, which has come out with cer- 
tain clean cut results which I shall discuss in a 
moment, it was first demonstrated that some ten 
psychological and physiological variables repeat- 
edly emerged as salients in a single factor in 
studies dealing with individual differences in 
anxiety level, i.e., by R-technique. 

After this R-technique demonstration of the 
boundaries of anxiety as an individual difference 
trait, a longitudinal study was made in which the 
fluctuations (in these salient variables discovered 
on the R-technique factor) were measured from 
day to day under the various naturally occurring 
anxiety stimuli of daily life. A longitudinal fac- 
tor analysis, by P-technique, then turned out much 
the same factor pattern as had been obtained by 
R-technique. There were some differences of 
emphasis, but it was clearly the same anxiety fac- 
tor, marked by the same major variables. 
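
The difference between the two designs lies in what is allowed to vary when the correlations are computed. The sketch below, on made-up data, builds only the two correlation matrices that would then be factored; it does not carry out the factoring itself. The variable names and scores are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def correlation_matrix(rows):
    """Correlate the columns (variables) of a rows x variables table."""
    columns = list(zip(*rows))
    return [[pearson(ci, cj) for cj in columns] for ci in columns]

# Hypothetical data: scores[person][occasion] = (tension, pulse_rate).
scores = [
    [(2, 60), (3, 64), (5, 70)],   # person 0 on three occasions
    [(4, 66), (5, 71), (7, 75)],   # person 1
    [(6, 72), (6, 74), (9, 83)],   # person 2
]

# R-technique: many persons, one occasion (individual differences).
r_matrix = correlation_matrix([person[0] for person in scores])

# P-technique: one person, many occasions (fluctuation over time).
p_matrix = correlation_matrix(scores[0])
```

If the same variables covary in both matrices, as they are made to do in this toy example, the pattern meets the double requirement described above.
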

A third phase of the research consisted in meas- 
uring a large number of people on this array of 
variables and then submitting them, in an analysis 
combining the factorial design of analysis of vari- 
ance with factor analysis, to a number of what 
are commonly considered anxiety provoking stim- 
uli, such as important examinations, a discussion 
of imaginary diseases, some probing of their eco- 
nomic condition, etc. Correlating in the stimu- 
lus differences with all the response differences, 
resulted in the reappearance of the same anxiety 
factor pattern. In this condition-response design, 
however, it was additionally loaded with the stim- 
uli which are effective in producing the anxiety 
response pattern. Scheier’s work on the measure- 
ment of anxiety thus illustrates the full present 
scope of multivariate method usage and shows 
how a practical measurement of high validity and 
determinateness can result. 

This digression on complex issues of method 
may have been so brief as to evoke the comment 
that for those who knew them already, it was un- 
necessary and for those who did not, it was too 
short to carry the full implications. But we must 
move on with the statement that if this argument 
is fully examined, it provides justification for be- 
lieving factor analytic findings rather than clinical 
impressions. It also prevents our aligning our- 
selves, on the other hand, with that compulsively 
accurate psychometrics of scales which still nar- 
rowly persists in the old faculty psychology of 
supposing that where there is a single name there 
must be a single function. 


A BRIEF REVIEW OF FACTOR ANALYTIC FINDINGS 


Although factor analytic findings over the last 
fifteen years have been evaluated elsewhere (4), 
it will be helpful here to give a brief sketch of 
the substantive findings which are the necessary 
basis for the measurement theory I have to dis- 
cuss. These results in 1958 are largely the out- 
come of certain aims and canons of research and 
method worked out in a first attempt to integrate 
the field, namely, my Description and Measure- 
ment of Personality (2) twelve years ago. In the 
first place, our laboratory has always aimed to 
gather data widely and simultaneously over the 
three chief possible media of personality observa- 
tion—L-data or life records of behavior in situ, 
Q or questionnaire data, and T or objective test 
data. In the life record medium, personality is 
observed in the natural life situation, by time 
sampling, rating, or keeping of records on par- 
ticular events, e.g., achievements, accidents, etc. 
This is, of course, the criterion medium, in the 
sense of being an external or cultural criterion for 
any testing. In the questionnaire medium the 
person responds by giving his impressions of him- 
self, limited by his own self-knowledge and will- 
ingness to disclose. In the objective test me- 
dium, there is no question of introspective self- 
evaluation as in the questionnaire, but only of 
actual performance or response in a miniature 
situation, in which the subject does not know what 
aspects of his performance are being scored and 
interpreted. 

1 Some definition of “objective” is required, since 
there are commonly two degrees of objectivity in test 
construction: (1) objectivity of scoring, plus, (2) 
objectivity in the sense of not involving self appraisal. 
It is in the latter, complete, sense that “objective” is 
used here, and we would suggest the term conspective 
for a test that is only objective as to its scoring. This 
implies that it has a high conspect reliability, that is 
to say, complete agreement between two different 
psychometricists as to what the score would be. A 
test which is not conspective might be called a rative 
test, indicating that the score depends on the judg- 
ment of a single person and is made by rating rather 
than key scoring methods. The reader will notice 
that Edwards (8) in his contribution has used the 
term objective in a way which allows both the true 
objective test as defined here and the conspective test 
to come under the same heading. However, psycho- 
logically and in terms of the history of test develop- 
ment, the difference between the conspective test and 
the true objective test is greater than that between the 
rative test and the conspective test, for it deals with 
the whole test character, whereas the latter deals 
only with the mode of scoring. For these adequate 
reasons, and to agree with the usage systematically 
adopted throughout 15 years of publication from our 
laboratory, we have used objective for the non-self 
appraisal type of test. 

At each age level at which investigations have 
been made we have always begun by operating 
in the first of these media, because it connects 
most readily with existing concepts in the field, 
permitting interpretations of the factors in popu- 
lar, clinical, and general terms. An immediate 
integration of concepts is then possible because 
the measurements use the same words and situa- 
tions of behavior as are covered by the clinicians, 
guidance psychologists, educators, and others. It 
also has the advantage that it permits use of a 
personality sphere concept, that is to say, the 
notion of a stratified sample of variables from the 
total realm of behavior. Incidentally, for those 
interested in perfecting the factor analytic ap- 
proach, this ability to introduce a concept of a 
population of variables becomes very important. 


CANONS OF PROCEDURE 


In adopting this simultaneous, three-fold obser- 
vation of personality, it was our conviction that 
any important dimension of personality should 
break through all three, showing itself at once as 
a factor pattern in behavior rating data, in the 
questionnaire-response patterns, and in objective 
personality tests. This theory has been only par- 
tially vindicated, and some tantalizing exceptions 
persist. Throughout this discovery of structure 
as a basis for measurement it has been a canon 
of research procedure that factors shall be deter- 
mined by simple structure principles, and other 
principles permitting a unique, objective factor 
solution quite independent of any psychological 
pre-conceptions which the observer may have 
about personality structure. Psychologists who 
have been using factor analysis by rotating for 
“psychological meaning” are merely having a 
pleasant game perpetuating their own superstitions 
or prejudices. 

Some years ago I talked at Tübingen with the 
German psychologist, Kretschmer, who has done 
such striking clinical experimental work in bring- 
ing out the full nature of the schizothyme tem- 
perament pattern. Whenever I showed him an 
experimental pattern in factor analysis that agreed 
with his clinical impressions, he would say “factor 
analysis is a remarkably important scientific tool.” 
But when I showed a pattern that corresponded 
to no known clinical pattern, his inclination was 
at once to assume it to be an artifact, and immedi- 
ately to lose interest in it. This I cite only as a 
rather amusing and well developed instance of the 
attitude that, together with some defects of sta- 
tistical education, has kept the clinical psycholo- 
gist from understanding the importance of these 
factored measurement developments for his work. 

Indeed, there is no need especially to pillory 
clinical psychologists, for psychologists in general 
seem rather prone, relative to physical scientists, 
to dependence on subjective conviction. Yet if 
we are dealing with a science rather than a re- 
ligion, we should welcome objective methods 
which surprise us by turning up something that 
does not in the least fit what we knew before. 
Factor analysis has, in fact, produced surprises in 
the clinical field, for those who can see them, 
much as the microscope did in biology. Notably 
it has turned up at least a dozen clear cut patterns 
in the personality field, that contribute as much to 
the variance of behavior as any such familiar 
concepts as schizothymia, ego strength, domi- 
nance, etc., which have nevertheless never been 
visible to the naked eye of the clinician or named 
or discussed. These structures have not yet been 
accepted as the challenge to existing clinical the- 
ory and formulations that they really are, for they 
have power to yield predictions of criterion be- 
havior impossible from the familiar concepts. 

A third canon of our research has been that the 
factor patterns shall be replicated, in at least two 
independent researches, before we begin to give 
them serious theoretical consideration. To put 
this canon into effect requires considerable plan- 
ning in research, to ensure that a sufficiency of 
identical variables to permit matching are carried 
over from one study to another. For in this day 
and age we can no longer go along with the idea 
that the identity of a factor in one study with 
that in another study can be established merely 
by the psychologist’s impression of the psycho- 
logical similarity of the two. There must be accu- 
rate carrying over of salients, and the use of a 
quantitative index, such as the salient variable 
similarity index, to ensure that the patterns really 
are alike. 
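
As an illustration of what a quantitative matching index involves, the sketch below computes Tucker's congruence coefficient between two loading patterns on shared marker variables. This is a different index from the salient variable similarity index named above, which is not reproduced here; it is swapped in only because its formula is simple and well known. All loadings are hypothetical.

```python
from math import sqrt

def congruence_coefficient(loadings_a, loadings_b):
    """Tucker's congruence coefficient between two loading patterns
    defined on the same marker variables in the same order."""
    if len(loadings_a) != len(loadings_b):
        raise ValueError("patterns must cover the same marker variables")
    cross = sum(a * b for a, b in zip(loadings_a, loadings_b))
    norm = sqrt(sum(a * a for a in loadings_a) *
                sum(b * b for b in loadings_b))
    return cross / norm

# Hypothetical loadings of five shared marker variables on a factor
# found in two independent studies, plus one unrelated factor.
study_1 = [0.60, 0.50, 0.40, 0.00, -0.30]
study_2 = [0.55, 0.45, 0.50, 0.05, -0.25]
unrelated = [0.00, 0.10, -0.40, 0.60, 0.50]

same_factor = congruence_coefficient(study_1, study_2)
different_factor = congruence_coefficient(study_1, unrelated)
```

A value near 1 indicates closely matching patterns. Such a comparison is only possible when the marker variables have been carried over from one study to the next, which is the planning requirement stressed above.
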

A fourth principle has been that we should not 
be too hasty in interpreting the factors, but should 
be content to designate them by an index number 
in some agreed universal index among psycholo- 
gists, such as that which I have proposed as an 
international index in the current issue of the 
Japanese International Journal of Psychology. A 
factor will commonly become a recognized part 
of the scenery, and a basis for measurement in 
a good unifactor scale, some years before its 
nature is fully understood. In the case of about 
half a dozen of the discovered factors, namely, 
ego strength, intelligence, anxiety, general neu- 
roticism, schizothymia, and superego strength, 
I think the pattern is sufficiently identical with 
anything that has ever been called by that name 
by a responsible psychologist, to justify using these 
customary names and interpretations—such as 
they are. In about another half dozen factors a 
pretty definite idea can be formed of the physio- 
logical, experiential, or dynamic influence respon- 
sible for the pattern. For example, surgency- 
desurgency level is essentially the level of general 
inhibition, and seems correlated with frequency 
of past punishment. The factor we have called 
Q3 seems to represent the degree of dynamic in- 
vestment in the self sentiment, and so on. While 
these explanations are not as perfect and perhaps 
not as lurid as those that psychoanalysts are fed 
with their Freudian mother’s milk, they have the 
advantage of dealing with demonstrable behavior 
patterns, and of permitting measurements of indi- 
vidual differences, with known validity and relia- 
bility, which can be made the basis of experimen- 
tal investigation of theory. Surely it is high time 
that theory began to build itself around these 
measurable behavior patterns, frequently repli- 
cated in eight to a dozen researches, instead of the 
vaguely perceived and statistically unsubstantiated 
behavior patterns and sequences which many clin- 
icians take as the basis for elaborate theories. 

A last canon of design, in these experiments to 
put personality measurement on a functional basis, 
and functional concepts on a measurement basis, 
is that continuity should be established in the 
patterns over the whole developmental age range. 
That is to say, not only should the functional unity 
be established at one age level, by the above two 
handed use of R- and P-techniques, but the age 
range should be cut by such studies at three- or 
four-year intervals, to establish the mode of 
growth, as one might take slices across the stem 
of a plant. This is a big order, and it has not 
yet been filled, but sections have recently been 
taken at 12, 8 and 4 years of age and are in press. 


LONGITUDINAL ANALYSES 


The hypotheses of measurement here are that 
some patterns might be expected to persist over 
all age sections more persistently than others. For 
example, an ability like general intelligence, or a 
temperament trait associated firmly with some 
physiological or body-build component, would be 
expected to show itself, perhaps with some modi- 
fications, from the earliest testing period. On the 
other hand, an environmental mold pattern, such 
as the superego or a sentiment to a specific object, 
might be expected to appear only at a given age 
and to show more pronounced developmental 
change in the loading pattern. The work on per- 
sonality factors was initially done, for good rea- 
sons, at the young adult level, but the researches 
of Coan, Peterson, Gruen, and others, maintain- 
ing the combinations of life record data, ques- 
tionnaire data, and objective test data, show that 
all but three or four of the factors established in 
the adult level can be traced down through child- 
hood and even into infancy. For example, in the 
factor analyses of time samplings of behavior, 
made at the four-year-old level, we can clearly see 
the cyclothyme-schizothyme factor, the domi- 
nance-submissiveness factor, the surgency-desur- 
gency factor, the paranoid factor, the ego strength 
factor, and so on, operating in the nursery school 
world. 

On the practical side, for the benefit of those 
who wish to do longitudinal research in personal- 
ity over a sufficient interval, we are in process of 
constructing measures of these factors in the ques- 
tionnaire medium. Thus, 14 of the 16 factors in 
the adult 16 Personality Factor Questionnaire (3) 
can be demonstrated and set up in the range from 
12 to 16 years of age, in a test called the High 
School Personality Questionnaire (5). Twelve of 
the factors can still be clearly recognized at the 
nine-year level and are being put into the Child 
Personality Questionnaire. Peterson has made 
sets of questions, which must, of course, be given 
orally, which get at these personality factors at 
the nursery school level. There are many tech- 
nical difficulties in getting a good series of person- 
ality factor questionnaires to operate meaningfully 
from the four-year level right up through the adult 
level, but these difficulties must be overcome, be- 
cause such longitudinal studies are essential both 
for understanding personality and for the success 
of applied psychology. 

Indeed, there is both a great need and great 
opportunity at the moment for longitudinal stud- 
ies in personality structure. Such studies will, 
first, establish more definitely the identity of the 
factors found at the earlier level with those at the 
later level, by repetitive measures on the same 
children at intervals; second, show which factors 
are most subject to environmental influences, and 
if so, to what environmental influences; third, 
show the general curve of change in these person- 
ality factors in the same sense as we have estab- 
lished the normal trend in the intelligence factor; 
and last, suggest in what way the pattern of be- 
havior typically changes with age. 

In regard to these changes in the test weights 
in the pattern to be measured, we may instance 
that the ego strength factor in four-year-old chil- 
dren loads freedom from temper tantrums, free- 
dom from enuresis, infrequency of headaches and 
psychosomatic disorders, infrequency of manifes- 
tations of jealousy, etc. By eleven years of age, 
enuresis has dropped out of the loading pattern, 
the main emotional stability versus instability vari- 
ables remain, and some new elements have come 
in. Similarly, in the dominance factor, disobedi- 
ence, sulking and “talking back” are prominent 
in the time sampling variables at the early age, 
whereas by the adult level, these are no longer 
present. The disobedience has become unconven- 
tionality, but the talkativeness has disappeared and 
the dominant adult is, if anything, rather more 
silent than the average. 


VALIDITY OF PERSONALITY MEASUREMENTS 


In discussions on the validity of personality 
measurements, it is desirable constantly to distin- 
guish between concept or construct validity, on 
the one hand, and external, or cultural, validity 
on the other. The former is defined by the corre- 
lation between a given objective test questionnaire 
or rating scale and the factor as derived from all 
known criterion variables. Thus we might vali- 
date a test against such factor constructs as anxi- 
ety, or against ego strength, or against surgency, 
or against schizothymia. The external or cultural 
validity is never validity singular but validity 
plural. That is to say, there are thousands of 
things against which a factor’s predictive power 
could be tried and the correlations known in the 
interests of interpretation; but no one of them is 
the criterion. For instance, the use of the 16 PF 
test has yielded a great many significant personal- 
ity factor correlations, for example, with success 
in school, prognostic rating in a clinic, automobile 
accident proneness, alcoholism, etc., and these 
have greatly enriched the original interpretations 
based on factor content alone. 

One of the first inquiries to be made about the 
nature of a factor—indeed it should be the rou- 
tine inquiry before making any more specific 
hypotheses—is to test whether it is largely heredi- 
tarily determined or substantially a product of 
learning and environment. Obviously, this is of 
basic importance both for theory and for the 
proper practical use of the measurement. Indeed, 
one of the chief claims of the factorially unitary 
measurement is that it permits something more 
than merely statistical prediction—namely, an 
estimate of criterion performance that takes into 
account whatever general psychological knowledge 
about the natural history of a trait permits us 
additionally to infer. Fortunately, some fairly 
extensive nature-nurture studies have already 
placed the principal factors in perspective, in rela- 
tion to such older factors as Spearman’s “g.” For 
example, we know from multiple variance analysis 
studies that the cyclothyme-schizothyme factor is 
largely hereditarily determined, that a surgency- 
desurgency source trait is largely environmentally 
determined, that the level of dominance-submis- 
sion is about 50-50 a product of constitution and 
familial-environmental influences, and so on. 

Although the greater meaningfulness of person- 
ality measures based on factors arises from the 
possibility of building around each of these func- 
tional unities a rich natural history, the actual 
growth of such knowledge has barely begun, be- 
cause of the extreme recency of satisfactory proof 
of the factors themselves. Meanwhile, the tests 
that can be and have been spawned with much 
greater ease have accumulated gargantuan stand- 
ardizations, as well as the momentum of enormous 
numbers of past students whose gifts seem to be 
exclusively in the rituals of administering them. 
Like the first small mammals entering a world 
possessed by the dinosaurs, the lately arriving fac- 
tored tests have a validation largely in the future. 


DYNAMIC CALCULATION 


The most recent, and as yet scarcely noticed, 
development of factored measures lies in that area 
of dynamic calculation which is so vital to clinical 
psychology and to motivation theory. This rests 
on the discovery that the drive patterns in man 
can be established by the factoring of collections 
of objective motivational measures. Sex, self as- 
sertion, fear, and six other drive patterns have 
been replicated now in three successive studies by
these means. Alongside these easily recognizable 
drive patterns, there occur patterns that closer 
scrutiny suggests can only be acquired dynamic 
sentiments, such as the self sentiment, the senti- 
ment to religion, and the sentiment pattern of atti- 
tudes and interests acquired about one’s pro- 
fession. 

In these studies an attitude is defined as a stim- 
ulus-response habit. By this model the strength 
of an interest, that is, the need to react, in regard 
to any course of action, can be expressed by a 
specification equation, weighting the tension levels 
of the various drives and sentiment systems of the 
strengths measured in the given individual. I 
must refer you to my recent book on Personality 
and Motivation Structure and Measurement for 
the postulates, and the chief equations, utilizable 
in the dynamic calculus which develops on this 
basis. The non-arbitrariness of the drive pattern 
makes possible unambiguous measurements. For 
example, it is found that the achievement mo- 
tive can be resolved into three distinctive com- 
ponents, a drive and two sentiment structures. 
On the basis of such unambiguous, functionally 
distinct and replicatable measurements, the exper- 
imental investigation of dynamic laws and moti- 
vational theories can go forward more exactly and 
more subtly than before.
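
The specification equation described in this section can be illustrated with a minimal sketch. It assumes the simple linear form, in which the strength of an interest is a weighted sum of the tension levels of the drives and sentiments involved. The factor names, weights, and tension scores below are hypothetical values chosen for illustration and are not taken from the studies cited.

```python
# Minimal sketch of a linear specification equation (illustrative only).
# Assumption: interest strength = sum of (weight * tension level) over
# the drives and sentiments. All names and numbers are hypothetical.

def interest_strength(weights, tensions):
    """Return the weighted sum of tension levels for one course of action."""
    missing = set(weights) - set(tensions)
    if missing:
        raise KeyError(f"no tension score for: {sorted(missing)}")
    return sum(weights[name] * tensions[name] for name in weights)

# Hypothetical weights of one attitude on three dynamic factors.
weights = {"fear": 0.2, "self_assertion": 0.5, "self_sentiment": 0.3}

# Hypothetical standardized tension levels for one individual.
tensions = {"fear": -1.0, "self_assertion": 1.0, "self_sentiment": 2.0}

strength = interest_strength(weights, tensions)
print(round(strength, 2))
```

The weights play the part of factor loadings for the attitude in question, and the tension levels are the individual's scores on each drive or sentiment; a different individual changes only the second dictionary.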


FACTORING OF DRIVES 


A development of crucial importance for clin- 
ical theories which these measurements have made 
possible, is the factoring of drives to determine, 
by P-technique, their quantitative contribution to
the interests, attitudes, symptoms, and conflicts, 
conscious and unconscious, of the individual clin- 
ical case. Such a study is now being conducted 
by Williams, factoring each of 24 patients, to see 
the degree of agreement between the statement of 
each individual conflict based on the psychiatrist’s 
experience with the case, and the description of 
the conflict quantitatively in terms of the dynamic 
calculus. If the agreement is reasonably good, I 
think we shall have demonstrated a very powerful 
new clinical tool. Parenthetically, I may add that 
to the extent that the agreement turns out to 
be imperfect, one may reasonably have doubts 
whether the psychiatrist or the dynamic calculus 
is wrong. In fact, our first step if the agreement 
is inadequate will be to bring in a second psychi- 
atrist to see how far he agrees with the first! 


NEW TYPES OF TESTS


The development of structured measurement in 
motivation has gone hand in hand with the inven- 
tion of quite new types of objective tests, no longer 
requiring the actual scores on component interests 
and attitudes to rest on the verbal opinionnaire or 
the open-ended, projective type of test. These 
objective motivation measures include some de- 
vices using the so-called projective principles, to- 
gether with physiological measures of motivation, 
learning measures, and many others. As far as 
theory is concerned, the interesting point of this 
analysis is that we seemed to get three distinct 
motivation strength factors, apparently corre- 
sponding to the id, ego, and superego contribu- 
tions in any given interest. 


PRACTICAL IMPLICATIONS 


These theoretical implications will doubtless be 
much scrutinized and debated, but there are some 
immediately dependable conclusions for the prac- 
tical man. First, the classical opinionnaire method 
of measuring attitude-interest strength by verbal 
self-evaluation has quite poor validity, accounting 
for only about a fifth to a tenth of the variance 
in the main motivation factors, however they are 
interpreted. Consequently, generalizations about 
attitudes and interests based only on this instru- 
ment could be highly fallacious as far as the total 
variance in interest strengths is concerned. Sec- 
ondly, the projective tests, or misperception tests 
as we prefer to call them, are not clearly distin- 
guished by any factors from the rest of the motiva- 
tion measurement devices. Thus in the theoreti- 
cal reconstruction suggested by this work, the 
classification of motivation measurements would 
fall principally into id, ego, and superego compo- 
nent measures, and the division into projective 
and nonprojective, physiological and non-physio-
logical, etc., signs of motivation strength becomes
rather pointless. 
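
To put the "fifth to a tenth of the variance" figure in more familiar terms, the arithmetic below converts a proportion of variance accounted for into the corresponding correlation. It assumes the usual relation that the variance accounted for equals the squared correlation; the function name is hypothetical.

```python
import math

def validity_from_variance(proportion):
    """Correlation implied by a proportion of variance accounted for."""
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("proportion must lie between 0 and 1")
    return math.sqrt(proportion)

for label, proportion in (("one fifth", 0.20), ("one tenth", 0.10)):
    print(f"{label}: r is about {validity_from_variance(proportion):.2f}")
```

On this assumption, a fifth to a tenth of the variance corresponds to validity coefficients of roughly .45 to .32.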


MEASUREMENT OF STATES AND TRAITS 


Any comprehensive view of progress in person- 
ality assessment must include the measurement of 
states as well as the measurement of traits. The 
work of Scheier (77) on the measurement of anx- 
iety provides, as I have briefly mentioned, a very 
neat methodological demonstration of conceptual 
and statistical problems involved in separating 
states and traits. Scheier has now checked the 
anxiety state pattern in two independent factor 
analytic studies, and the anxiety trait pattern in
no fewer than eight independent researches. 
There is enough similarity in the state and trait 
patterns to justify the popular habit of using the 
term anxiety for both. Both load a particular set 
of markers in the questionnaire realm, in objec- 
tive personality tests, and in physiological response 
measures, though the emphases are interestingly 
somewhat different. For aught the early scale 
makers knew, there might have been four or five 
distinct and uncorrelated factors of anxiety rather 
than a single factor. 

As it turns out there does seem to be a single 
factor of anxiety, but these premature scales mix 
this anxiety factor with the quite distinct neu- 
roticism factor and a number of other irrelevant 
and contaminating factors. It is really not sur- 
prising that anyone surveying the literature of the 
past ten years is discouraged by almost every find- 
ing being matchable by an equal and opposite 
finding, for even when investigators verbally de-
fined anxiety in the same way, they frequently
used a different test for it. It is rather early to 
see what the full impact of factor analytic work 
on anxiety measurement will be in giving a new 
momentum to insightful clinical research. The 
instrument could permit the emergence of a whole 
series of new laws and therapeutic certainties, re- 
placing the present gropings toward scales of ob- 
scure meaning. 

One of the certainties which emerged relatively 
early from this joint attack by Eysenck’s labora- 
tory and our own, was that anxiety and neuroti- 
cism are distinct factors. They have a slight 
obliquity, but, as I have shown with substantiat- 
ing evidence, they can be measured with satisfac- 
tory reliability by objective tests, and, when so 
measured, it becomes evident that a person can 
stand at any position on one axis while occupying 
any position on the other. If the further, but far 
more tentative evidence by Eysenck’s coworkers 
and ourselves, of a general psychoticism dimen- 
sion is sound, then both anxiety and neuroticism 
are, additionally, independent of measured psy- 
choticism. 

These, however, are only local areas of illumi- 
nation in the factor analytic picture, rendered 
clearer by our clinical familiarity with the phe- 
nomena. Outside these brightly lit spots, in the 
domains of the remaining dozen or more person-
ality factors, definitely locatable but uninterpreted,
there exist obscurities and some intriguing para-
doxes, now engrossing the pure researcher. For
example, for the last four years it has been known 
that two substantial second order factors can be 
found among the primary personality factors as 
represented in the 16 Personality Factor Ques- 
tionnaire. 

The first of these second order factors brings 
together the separate dimensions of ego weakness, 
high ergic tension, and the mysterious O factor, 
sometimes called guilt proneness, and which we 
have so far hung on to mainly by the symbol O. 
The second of these massive second-order factors 
reveals the existence of a common influence be- 
hind surgency, cyclothymia, dominance, and the 
factor which we have called parmia, which is 
short for high parasympathetic system dominance. 
These patterns were confirmed by the independent 
study of Karson, at the University of New Hamp- 
shire, and I think that we can now agree that the 
second of these two large factors gives substance 
to the Jungian concept of extraversion-introversion 
as a definite, invariant second order factor, rather 
than as the mere correlation cluster which it was 
once thought to be. That is to say, the general 
personality dimension of extraversion really ex- 
presses itself in five relatively independent primary 
factors: surgency, dominance, parmia, cyclothy- 
mia, and lack of self-sufficiency. The quality of 
an individual’s extraversion therefore needs to be 
defined by his separate scores on these five com- 
ponents. 

Although the second massive second order fac- 
tor thus quickly fitted into a concept long popu- 
larly discussed, the first large pattern involving 
ego weakness, ergic tension, etc., as I have de- 
scribed, could not be immediately interpreted. 
However, when Scheier began his work with ob- 
jective anxiety measurements, he included the 16 
Personality Factor Questionnaire in his study and 
when he determined the loading of his objective 
test anxiety factor on these questionnaire meas- 
urements, the pattern of loadings turned out to 
be exactly the same as that found in the factoriza- 
tion of the questionnaire itself. In other words, 
the second order factor among the questionnaires 
is identical with the first order factor among the 
objective test measurements. On looking at the 
psychological picture this begins to make good 
sense for it tells us that high anxiety is contributed 
to by ego weakness, by high ergic tension, that 
is to say, frustration of drive expression, and by 
the temperamental guilt proneness component. 

Without time for expanding these comments, I
would point out that we now have three instances 
where a second order factor in the questionnaire 
realm has become recognized and confirmed as a 
first order factor in objective, instrumental tests. 
The perplexing lack of relationship between the 
questionnaire and behavior rating factors on the 
one hand, which mutually agree well, and the 
objective test factors on the other, which have 
previously defied alignment, therefore begins to
resolve itself. The objective test factors are sec- 
ond order factors to the primaries found in the 
other media of observation. This is only one 
illustration of the increasing interconnection and 
illumination of structure which is now beginning 
to take place in the factor analytic realm. How- 
ever, I want to add a technical word of warning. 
I think this fitting together of the jigsaw puzzle 
can continue only insofar as we all give far more 
attention to good technical precision in our first 
order factor analyses than has been typical of 
work in the last ten years. In particular, far 
greater diligence is necessary in getting accurate 
rotation, to a plateau of maximum percentage of 
variables in the hyperplane, whenever simple 
structure is alleged to be obtained. 
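
The criterion of a "maximum percentage of variables in the hyperplane" can be made concrete with a small sketch. It counts the share of loadings that fall within a narrow band around zero. The band width of ±.10 and the loading matrix are assumptions made for illustration, not values from the text.

```python
# Sketch of a hyperplane count for a rotated factor loading matrix.
# Assumptions: loadings within +/-0.10 of zero count as "in the hyperplane";
# the matrix below is invented for illustration.

def hyperplane_percentage(loadings, width=0.10):
    """Percentage of all loadings whose absolute value is at most `width`."""
    total = sum(len(row) for row in loadings)
    if total == 0:
        raise ValueError("empty loading matrix")
    in_plane = sum(1 for row in loadings for value in row if abs(value) <= width)
    return 100.0 * in_plane / total

# Rows are variables, columns are factors (hypothetical values).
loadings = [
    [0.70, 0.05, -0.02],
    [0.65, -0.08, 0.04],
    [0.03, 0.60, 0.09],
    [-0.06, 0.55, 0.30],
]

print(hyperplane_percentage(loadings))
```

In a rotation search one would recompute this figure after each trial rotation and continue until it no longer rises, which is one way of reading the "plateau" referred to above.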


AVOIDANCE OF GENERAL THEORY 


If I have referred insufficiently to general the- 
ories, it is because I believe that psychology par- 
ticularly needs to guard itself at this stage of 
development from getting into cloudy regions of 
grandiose theory, instead of seeking well estab- 
lished laws and concepts, susceptible to accurate 
measurement. In a healthy science, wider theories 
arise from well determined regularities, which we 
call laws. If you give yourselves a few seconds’
thought on this, I think you will realize that the 
unquestionable, dependable laws in psychology 
can probably be counted on your fingers, not in 
the hundreds with which they can be counted in 
the physical sciences. This is both a cause and 
a consequence of the psychologist’s readiness to 
escape, at the drop of a hat, into philosophy. The 
despair which motivates this escapism is a justifi- 
able reactive depression to the very small amount 
of progress made in psychology relative to the 
enormous amount of labor that has gone into 
research over the past thirty years, especially in 
the clinical and personality area. 

Of course, if we get too dissatisfied with our-
selves, in relation to the chemists’ and physicists’
accomplishments, we can always look across at 
the still more chaotic and barren backyard of our 
neighbors, the sociologists. We are not quite at 
the head of the list in terms of getting nowhere in 
a great hurry. Several shrewd observers have 
pointed out one feature that constantly seems to 
distinguish research in the social sciences from 
research in the physical sciences: the physical sci- 
ences typically show an architectonic growth, in 
terms of one research building constructively upon 
another, whereas in the social sciences there are 
an enormous number of unrepeated researches, in 
which particular variables are used by a particular 
investigator and never touched again by anyone. 
The resulting scenery is a shanty town of one 
story hovels instead of the skyscrapers which the 
physical sciences build. 

I think there are three major, and doubtless 
many minor, reasons for this. First, we have 
tried to ape the physical sciences by concentrating 
on the univariate controlled experimental method, 
instead of the multivariate experimental method 
which is alone truly adapted to the far more nu- 
merous variables and complex determination with 
which we deal. Second, our work needs far more 
mathematical discipline than our students have 
been willing to acquire. Third, there has been 
insufficient social organization of research. By 
this last I mean that we have been inclined to 
ascribe our failures wholly to defective technical 
methods when frequently they are due to defective 
coordination of research. Better social organiza- 
tion can come either from the organization of 
teams and institutions or through more sensitive 
conscience and vision in the individual research 
worker. One of the immature features of our 
science seems to be a bizarre teenager sense of 
honor, which dictates that no individual with 
claims to creativity could possibly use the same 
variables as any other individual and certainly not 
stop to replicate any extensive experiments previ- 
ously done. There is also what I would call mag- 
pie research in which the investigator seems at- 
tracted for purely emotional reasons by the glitter 
of a particular variable or piece of apparatus, e.g., 
the psychogalvanometer, social prejudice, colored 
inkblots, sociometric count, or what have you, 
and centers his research on a mere variable with- 
out any broader theoretical or conceptual frame-
work.


SOLUTION: ORGANIZATION AND DIVISION OF LABOR


I believe a great deal of progress could be made 
by a very simple practice indeed, namely, that of 
putting other people’s variables into one’s corre-
lation matrix. People can continue to hold quite
different theories about what is happening behind 
these variables, but at least if we linked hands on 
some marker variables we could with comparative 
certainty begin to relate and debate the theories 
through the intercorrelation matrices. We do this 
as a routine procedure in our own factor analyses, 
taking a minimum of two marker variables for 
each well known factor, for example, from the 
work of Eysenck, Guilford, and other previous 
researches, when we start a new investigation on the
next factor of theoretical interest. Factor analyses 
carried out without such markers from the known 
terra firma are strictly uninterpretable. They in- 
habit a solipsistic universe of their own, with no 
past and very little future, and might as well be 
carried out upon the moon. On the other hand,
an overlap of variables must sooner or later mean
an overlap of integration of ideas. If people want
to be productive, they should get their variables
together.
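
The practice of carrying at least two marker variables for each well known factor lends itself to a simple planning check. The sketch below flags any established factor that a proposed variable list covers with too few markers. The factor names and variable labels are hypothetical placeholders, and the helper is an illustration rather than a procedure described in the text.

```python
# Sketch: check that a planned variable list carries at least two marker
# variables for each previously established factor. The factor names and
# variable labels are hypothetical placeholders.

def missing_markers(planned_variables, markers_by_factor, minimum=2):
    """Return {factor: shortfall} for factors with fewer than `minimum` markers."""
    planned = set(planned_variables)
    shortfalls = {}
    for factor, markers in markers_by_factor.items():
        present = len(planned & set(markers))
        if present < minimum:
            shortfalls[factor] = minimum - present
    return shortfalls

markers_by_factor = {
    "anxiety": ["anx_marker_1", "anx_marker_2", "anx_marker_3"],
    "extraversion": ["ext_marker_1", "ext_marker_2"],
}
planned_variables = ["anx_marker_1", "anx_marker_3", "ext_marker_2", "new_test_a"]

print(missing_markers(planned_variables, markers_by_factor))
```

An empty result would mean that every established factor is anchored by the required number of markers, so that the new factors can be related to the known ones through the shared correlation matrix.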

This brings me to consider an important respect 
in which the development of our own personality 
assessment researches may be considered to lack 
integration. The charge must be admitted that 
factor analysts are so engrossed in establishing 
the form and nature of factors, with statistical ele- 
gance, in laboratory measures, that they have 
made quite inadequate effort to show the clinician, 
the educator, the industrial psychologist, and 
others what these factors mean in more popular 
terms, and particularly to interpret them in terms 
with which the general psychological theorist is 
familiar. But let us not mistake the principle of 
division of labor, which is necessary in a highly 
specialized world, for any lack of integration, 
which is not. It happens that a rather unusual
assemblage of skills, apparatus, and organized 
facilities is necessary for the effective advance of 
knowledge through applying factor analysis to 
establishing functional unities in behavior. 

One needs, first, research time, resources and 
subjects enough to permit lengthy measurement 
of a large range of variables; second, a research 
team with talents in the direction of proceeding 
from general theoretical concepts about personal- 
ity to actual miniature-situational objective test 
designs; third, a sure touch in the finer statistical 
issues in the area of multivariate analysis; and
last, an electronic computer, well furnished with 
programs for the principal factor extraction and 
rotation procedures. The combination of mathe- 
matical competence and clinical insight in psy- 
chologists, as now trained, is far from common, 
and the other conditions are positively rare; so 
it is not surprising that there are fewer than half 
a dozen laboratories in the English-speaking 
world, and none that I know of outside it, where 
this basic research is being intensively pursued.

However, although such research centers can- 
not easily be expanded to the requisite number, 
there is no need for them to be so few as they 
are. Any large university psychology department 
should be able to organize an effective laboratory 
in this area. Viewed in broader terms of national 
effort there are unmistakable similarities to our 
backwardness in the area of intercontinental bal- 
listic missiles. Both have the pattern of insuffi- 
cient planning of funds and facilities for the scale 
of work required, and the lack of ability to bring 
representatives of different departments together. 
In our case the coordination failure has shown 
itself especially, until recently, in obtaining strong 
teams combining clinicians and multivariate sta- 
tistical experimenters. 

The second necessary objective in the organ- 
ization of research is the expediting of external 
cultural validation of these functional unities, 
once they are established and have had good 
tests set up for them in the laboratory. I believe 
it cannot be too much stressed that this is a task 
which cannot effectively be done by the same 
team or organization as had been designed for 
the basic internal validation just described. In- 
stead this is the proper field for the vaster group 
of professional, applied psychologists in clinical, 
educational, and industrial research. There is 
always a lag between the conclusion of labora- 
tory research and its use in the field, and one 
wonders if this lag could not, with a little better 
cooperation, be cut down from ten to five years. 
We all know the theory that if a man in the 
backwoods invents a better mousetrap the world 
will, in a few days, make a beaten track to his 
door, but in an age of advertising and vested in- 
terests he is more likely to be paid to bury the 
invention. 

Through the momentum of custom alone, and 
the ego involvements of personal prowess with 
ink blots or Binet, the majority of clinical and 
educational psychologists are inclined to con-
tinue with the instruments they were taught to
use at college, though instruments of twice as 
high a validity may be open to evaluation in re- 
search reports. For example, many clinicians 
are only just beginning to realize that the factored 
questionnaire of today is something quite differ- 
ent from the ad hoc questionnaires of former 
years, and there is quite a good probability that 
it would give them better clinical diagnoses and 
prognoses than are obtainable from their cur- 
rent tools. Others, failing to realize the modern 
demands for research specialization which I have 
just stressed, seem to expect that the factor 
analysts will not only investigate structure but 
also supply them with the clinical validities of 
such tests, and they sit back and wait, ignoring 
their own vital role in test development. But 
the test construction itself is today a full time 
and highly specialized task. The amount of plan- 
ning, skill, and labor involved in factorizing lit- 
erally hundreds of variables, checking by replica- 
tion the factor structure in independent studies, 
and constructing unifactor scales from such vari- 
ables is enormously greater than that involved 
in the older style questionnaires and tests, which 
most of us could make up almost overnight. It 
is, however, true that the factor analyst has 
usually been content to dump his finished prod- 
uct before the clinician in the journals and to 
return to his computer and his laboratory. 
Unfortunately, even the applied psychologist 
who realizes his role in the teamwork of science, 
has been inclined to look at this abstract con- 
trivance with about as much enthusiasm and in- 
sight as a Bikini native looking at an atom bomb. 
He rightly fears that it is something which will 
involve radical changes in his mode of practice 
and thinking. Often he is inclined to defend 
himself from having to think in objective struc- 
tural concepts by saying that a factor is an arti- 
ficial mathematical monstrosity which will have 
no potency in his human clinical world. The 
result is that though a number of well factored 
tests highly relevant to clinical practice have be- 
come available over about the last five years, 
the activity which should have led to their ex- 
ternal validation has been utterly inadequate. 
The important point, however, is that on the 
few occasions when their external validity has
been crucially tried, it has turned out to be very
good. 

Turning from practice to basic theory, one 
notes that these external validities are vital to the 
full interpretation of personality factors and struc-
tural relations, for factors cannot be interpreted
in the laboratory alone. In the case of the 16 
P.F. Test, external validation has come in rapidly 
and freely, leading to great strides in the inter- 
pretation of these factors which only five years 
ago had little attached to them except the letters 
of the alphabet—just like the nutritional vita- 
mins before the chemical analysis and synthesis. 
Thus, although factor C might be tentatively in- 
terpreted from its descriptive ratings and ques- 
tionnaire responses as ego strength versus ego 
weakness, this hypothesis only received the degree 
of confirmation it really required with the ensuing 
demonstration that it is significantly and posi- 
tively correlated with leadership in face to face 
groups, that it is higher in successful than un- 
successful psychiatric technicians, that it cor- 
relates with school achievement among students 
of the same intelligence level, that it is nega- 
tively correlated with accident proneness, that it 
is substantially negatively correlated with anxiety 
proneness, and so on. 

Similarly, the finding that high F factor, or
surgency-vs.-desurgency, is substantially positively 
correlated with being chosen and voted a group 
leader, that it has one of the principal loadings 
in the second order extraversion factor, that it 
increases with alcohol, that it declines steadily 
with age from adolescence to middle age, that its 
level is largely a product of environment rather 
than heredity, that it increases significantly under 
frontal lobotomy and under psychotherapy, pro- 
vided valuable extension of the original factor 
hypothesis that desurgency is a form of general- 
ized inhibition, associated with frontal lobe action 
and with frequency of punishing, repressive past 
experience. This degree of insight into its na- 
ture could never have been achieved from the 
direct content of the factor, either in ratings or 
in the questionnaire responses. 

Accordingly, the great need in the social or- 
ganization of research at the present moment is 
a concerted plan for taking all factor analytically 
well established personality source traits and hav- 
ing their social validities, their changes with age, 
their relevance to clinical prognoses, their edu-
cational predictive value, etc., systematically
examined. No one clinical, counseling, or other 
applied psychological center can hope to do this 
alone or for all the factors. But a planned divi- 
sion of labor, in which certain laboratories or 
clinical centers make systematic studies of the 
life history of one factor and others of another 
could lead to an enormous increase in the prac- 
tical effectiveness of personality measurement in 
applied psychology in the next five or ten years. 

In conclusion, I hope I have given some con- 
vincing reasons why the construction of personal- 
ity measurement scales should be wedded to con- 
cepts of personality structure, and some evidence 
that the objective structuring of personality has 
come of age sufficiently to make this possible. 
How soon this marriage will be fruitful, in terms 
of major gains in the power and insightfulness 
of applied psychology, depends on how soon 
teachers of applied psychology cease thinking 
in terms of catalogues of tests and set out to 
teach tests and measurements as an epilogue to 
courses in personality. The psychology of struc- 
ture and growth comes first: the tests are merely 
an appendix to such an exposition. If after all 
this discussion, you were to ask me why I per-
sonally prefer factor scales to other scales, e.g.,
simple homogeneous scales, I think I should have 
to say because the former are psychologically in- 
teresting and the latter are dull. When you are 
through with a complicated scaling ritual you 
have perhaps at best eased a neurotic compulsion; 
but with factor scales you can have a lot of fun 
finding how people tick. 


SOME IMPLICATIONS OF A
SOCIAL LEARNING THEORY FOR 
THE PREDICTION OF GOAL 
DIRECTED BEHAVIOR FROM 
TESTING PROCEDURES * 


JULIAN B. ROTTER 1


Many sophisticated observers are aware that 
a wide gap exists between personality theory and 
the techniques or procedures used to measure 
personality variables. The low level of predic-
tion of such testing procedures may well be a
function of the failure to apply the theories them-
selves to the methods of measurement. Particu-
larly, it is a failure to apply an analysis of the
determinants of behavior in general to the specific
test taking behavior of the subjects (Ss).

* Reprinted by permission from Psychological Review, 1960, Vol. 67, No. 5, 301-316.
1 I am indebted to Shephard Liverant for his helpful comments and suggestions about this paper.

The gap itself may be described as having three 
aspects. The first of these relates to the question 
of the constructs used in the theory and the con- 
structs which the tests were developed to measure. 
In many instances rather than devising tests which 
measure specific theoretical constructs which are 
carefully defined and for which the test behavior 
can be understood as a logical referent, the de- 
scriptive constructs used to classify test response 
do not logically relate to the new theoretical con- 
structs but are bent or twisted to measure the 
new variables. That is, test constructs which were 
used to classify test responses developed earlier 
are “translated” to be measures of the new vari- 
ables. Examples of this are use of Rorschach 
variables such as color, movement, and shading 
which arose from imagery-type theory, to measure 
such constructs as “ego strength,” “rigidity,” and 
“tolerance for ambiguity.” The Rorschach was 
not developed to assess such variables and in 
translating or twisting older methods of Ror- 
schach scoring to measure these variables it is 
quite likely that a loss of prediction results. 

A second aspect of this gap between personality 
theory and methods of measurement of person- 
ality lies in the testing procedure itself. For 
example, where the theory may emphasize the 
significance of differences in behavior in the pres- 
ence of authority figures vs. peers or males vs. 
females, the formal test procedure assumes no 
such variables are important. That is, no dif- 
ference in interpretation of test results follows 
from the fact that the examiner may have a dif- 
ferent social stimulus value in one case than in 
another or under one set of conditions rather than 
another. In such an instance although the theory 
itself recognizes (and experimental data such as 
Gibby, Miller, and Walker (73) and Lord (32) 
support) the importance of the effect on behavior 
of the nature of the social stimulus, the test pro- 
cedure itself does not take it into account. An 
example would be in the application of Murray’s 
theory (37) which sees behavior as a function 
of internal needs and environmental presses. 
Tests have been developed using this theory 
(Thematic Apperception Test, as clinically used;
Edwards Personal Preference Schedule) which
presume to measure the strength of various needs 
but fail to account for the test behavior as a 
function of the testing situation itself (an environ- 
mental press) as one of the variables determining 
the test behavior. Other characteristics of this 
discrepancy between theory and test taking pro- 
cedure will be discussed more fully later. 

A third aspect of the gap lies in the area of 
inference from test behavior. The issue here is 
that there is an absence of logic or contradiction 
in the assumed relationship between what the S 
does, or test behavior, and what is inferred from 
such behavior. Peak (43) and Butler (6) among 
others have discussed this problem earlier. Jessor 
and Hammond (22) have noted such a gap in 
the inferences made from the Taylor Anxiety 
Scale. Another example could be drawn from 
the Edwards Personal Preference Schedule (11) 
in which Ss are asked to state their preferences 
for different kinds of goals but there is no the- 
oretical basis provided to allow one to make 
predictions about nontest behavior from such 
preferences. Of course, it can be assumed that 
the preferences have some one-to-one relation- 
ship with some criterion behavior, but it is un- 
likely that even the test authors would make a 
theoretical commitment. In other words, it is 
not clear exactly what can be predicted or should 
be predicted from the test responses. Individuals 
using such tests, however, can defend themselves 
by stating that prediction is after all an empirical 
matter and one has to find out what can or should 
be predicted. It is likely, however, that the con- 
struction of tests which are systematically or 
theoretically pure, in that they are devised to 
measure specific variables or to make specific 
predictions, with the method of measurement and 
inference consistent with the theory will ultimately 
provide much better predictions of behavior as 
well as a test of the utility of the theory itself. 

The purpose of this paper is to explicate some 
of the implications of a social learning theory 
of personality for the measurement of personality 
variables. The particular point of emphasis is 
the measurement of goal directed behavior con- 
ceptualized in social learning terms as need po- 
tential. Secondarily, the paper aims at illustrat- 
ing the nature of the relationship between testing 
procedures and inference about behavior more 
generally. 


PERSONALITY ASSESSMENT 


In social learning theory (47) the basic formula 
for the prediction of goal directed behavior is 
as given below: 


$$BP_{x,s_1,R_a} = f(E_{x,R_a,s_1} \mathbin{\&} RV_{a,s_1}) \tag{1}$$


The formula may be read as follows: The po- 
tential for behavior x to occur in Situation 1 in 
relation to Reinforcement a is a function of the 
expectancy of the occurrence of Reinforcement 
a following Behavior x in Situation 1 and the 
value of Reinforcement a, in Situation 1. Such 
a formula, however, is extremely limited in ap- 
plication for it deals only with the potential for 
a given behavior to occur in relationship to a 
single specific reinforcement. The prediction of 
responses from personality tests requires a more 
generalized concept of behavior and the formula 
for these broader concepts is given below: 


$$BP_{(x-n),s_{(1-n)},R_{(a-n)}} = f(E_{(x-n),s_{(1-n)},R_{(a-n)}} \mathbin{\&} RV_{(a-n),s_{(1-n)}}) \tag{2}$$


This may be read: The potentiality of the func- 
tionally related Behaviors x to n to occur in the 
specified Situations 1 to n in relation to potential 
Reinforcements a to n is a function of the ex- 
pectancies of these behaviors leading to these 
reinforcements in these situations and the values 
of these reinforcements in these situations. For 
purposes of simplicity of communication, the 
three basic terms in this formula have been 
typically referred to as need potential, freedom 
of movement, and need value as in the third 
formula below: 


$$NP = f(FM \mathbin{\&} NV) \tag{3}$$


In this formula the fourth concept, that of the 
psychological situation, is implicit. The vari- 
ables referred to above and operations for meas- 
urement have been defined and further explicated 
in a previous publication (47). 
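
The third formula can be made concrete with a small sketch. The theory as stated here leaves the function f unspecified, so the multiplicative rule below, the 0 to 1 scales, and the two named needs with their values are assumptions introduced only for illustration. The sketch shows how ordering needs by need value alone can differ from ordering them by need potential.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Need:
    name: str
    need_value: float           # NV: preference for the goals (assumed 0..1 scale)
    freedom_of_movement: float  # FM: expectancy of attaining them (assumed 0..1 scale)


def need_potential(need: Need) -> float:
    """NP = f(FM & NV), with f ASSUMED here to be a simple product."""
    return need.freedom_of_movement * need.need_value


# Hypothetical individual: values recognition highly but expects little success.
needs = [
    Need("recognition", need_value=0.9, freedom_of_movement=0.2),
    Need("affection", need_value=0.6, freedom_of_movement=0.7),
]

strongest_by_value = max(needs, key=lambda n: n.need_value).name
strongest_by_potential = max(needs, key=need_potential).name
print(strongest_by_value, strongest_by_potential)
```

Under these assumed figures the most valued need is not the one with the highest need potential, which is the kind of discrepancy between preference and behavior that the later sections take up.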

In order to illustrate the social learning theory 
implications for measurement of personality and 
for the measurement of goal directed behavior, 
it seems expedient to consider three basic ap- 
proaches to this problem based on the number of 
determinants used theoretically to predict such 
goal directed behavior and the problems, limita- 
tions, and advantages of each approach. 


STRENGTH OF NEED AS A BASIS FOR PREDICTING
BEHAVIOR


Although many esoteric systems of prediction 
utilize essentially the strength of need, drive, or 
instinct approach, this method can be described 
as the simplest or least complicated approach. 
Basically a series of constructs are formulated 
more often on an a priori basis than empirically, 
or at least on a presumed clinical rather than 
experimental basis. These descriptive terms may 
refer to instincts, drives, needs, factors, entities, 
or vectors of the mind (i.e., the Minnesota Multi- 
phasic Personality Inventory, Edwards Personal 
Preference Schedule, Rorschach, Humm-Wads- 
worth, etc.). They all have in common that there 
is more than one basic characteristic, that these 
two or more characteristics are in some way 
measurable along a continuum and presumably 
the individual’s behavior can be predicted from 
the characteristics which are “stronger” and the 
characteristics which are “weaker.” 

Sometimes the personality disposition can be 
predicted from the strength of other constructs 
according to either simple or complex statements 
of relationship formally postulated, hypothesized, 
or informally asserted. These relationships can 
become quite complex as in psychoanalysis or 
quite esoteric as in Szondi’s explanation that mo- 
tivated behavior is a result of the interaction of 
dominant and recessive genes. Because the 
methods of measurement in some instances can- 
not be direct, as in the assessment of unconscious 
drives in psychoanalysis, an impression of great 
complexity is given but regarded entirely from 
the point of view of the prediction of behavior, 
the system may still have a simple character. 
The potential for a given kind of behavior is 
still directly predictable from the strength of the 
drive, instincts, needs, or energies postulated. 

There is another form of this model in which 
the various drives, dispositions, or needs are con- 
sidered to interact. For example, the individual 
may be conceived of as being controlled by his 
intellect and his emotions, but his behavior must 
be understood in light of the interaction of these 
two forces with a third variable, the will as in 
the Rorschach Test (45). Again, this makes 
complex the calculation of strength or weakness
but does not change the overall method of mak- 
ing predictions. Whether dealing with will, in- 
tellect and emotions or ego, superego and id, the 
effect of interaction is only to increase or
decrease the tendency of one of the needs to
function or to strengthen or weaken one of them or
perhaps to produce a fourth or fifth additional
need. The basic method of prediction stays the 
same although the calculation of strength or 
weakness in such a model becomes more difficult. 
The obvious problem, of course, for such a
rudimentary method of prediction is how to predict
anything at all. If a system included five in- 
stincts or needs and these are ordered on some 
metric system from high to low, does one assume 
that the person will always act in the fashion to 
be predicted from his strongest or highest need? 
If an S is more oral than anal, does he always
act in an oral fashion? Actually the most logical 
assumption in regard to any specific instance is 
that he will always act the same way. One might 
presume on a statistical basis, as it is sometimes 
done, if the individual is at the 70th percentile on 
Need A and at the 30th percentile on Need B, 
70% of the time he would act in one fashion 
and 30% of the time in the other over some 
undefined period of time. However, the only 
sensible statistical or logical prediction in any 
specific instance, if no other variables are con- 
cerned, is that he would act in accordance with 
the higher need. This might still give fairly good 
prediction if only 2 variables are involved, but 
if 20 variables are involved and many of them 
are very close in value or “strength,” then the 
amount of error begins to increase. In fact, it 
becomes a problem to predict even slightly above 
chance and, indeed, except for some limited and 
highly controlled experimental situations, this is 
the problem in psychology now. A recent
illustration of this failure is reported in a
carefully controlled study by Little and Shneidman
(29) who failed to find much relationship
between interpretations of psychological tests
(Rorschach, MAPS, TAT, and MMPI) and anamnestic
data. Loevinger (31), summarizing the
predictiveness of individual tests in the recent
Annual Review of Psychology, states, “To date the
only tests which meet standards for individual 
prediction are those of general ability” (p. 305). 
Previous reviewers have made similar statements. 
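
The arithmetic behind this argument can be sketched briefly. The function below assumes, purely for illustration, that behavior in line with each need occurs in proportion to that need's measured strength (the statistical reading mentioned above); the strength figures are hypothetical. Under that assumption, always predicting the strongest need is correct on a share of occasions equal to that need's share of the total.

```python
def hit_rate_of_strongest_need(strengths):
    """Expected share of occasions on which 'always predict the strongest
    need' is correct, ASSUMING each need governs behavior in proportion
    to its strength (an assumption made only for this illustration)."""
    total = sum(strengths)
    if total <= 0:
        raise ValueError("strengths must sum to a positive number")
    return max(strengths) / total


# Two needs at the 70th and 30th percentiles.
two_needs = hit_rate_of_strongest_need([70, 30])

# Twenty needs that are very close in strength (hypothetical values).
twenty_needs = hit_rate_of_strongest_need([52] + [50] * 19)
chance_with_twenty = 1 / 20

print(f"2 needs:  {two_needs:.2f}")
print(f"20 needs: {twenty_needs:.3f} (chance = {chance_with_twenty:.3f})")
```

With two well-separated needs the rule is right most of the time; with twenty nearly equal needs it is right only marginally more often than chance.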

Another problem which arises in the prediction
of behavior with this simple model is that it soon
becomes apparent that the strength of a wish, 
need, or drive to achieve some goal such as being 
taken care of, obtaining love, or injuring
someone is not a good predictor of the occurrence of
behavior directed towards the achieving of that 
goal. To some extent this problem can be dealt 
with by the notion of interaction of needs, but 
usually in order to account for the discrepancy 
between need or wish and behavior, constructs of 
the same order do not provide sufficient explana- 
tory basis. It is actually necessary to postulate 
some other kinds of internal constructs to ac- 
count for the discrepancy between what might 
be called wish, desire, need, or instinct and ob- 
servable behavior. 

In the measurement of these need strengths 
all varieties of tests and devices have been used. 
To some extent, the personality questionnaire is 
utilized a little more by people adopting such a 
predictive scheme as that described above, but 
also projective tests, observation, interviewing, 
and many other techniques of personality meas- 
urement have been used in this fashion. Test 
construction methodologies may currently be 
more sophisticated in that they control for social
desirability of items, motivation, faking, lying, 
and inability to understand directions. Recent 
tests may also rely on purification of factors, 
cross-validation, or item analysis. However, with 
all these “modern improvements” in test design 
one is still left with a series of figures which are 
of doubtful utility for the actual prediction of 
behavior at a level satisfactory for either prac- 
tical application or for the clarification of the- 
oretical issues. 


THE ADDITION OF AN EXPECTANCY CONSTRUCT IN 
THE PREDICTION OF GOAL DIRECTED BEHAVIOR 


The absence of additional variables explicitly 
defining the relationship of need and behavior 
appears to be not so much a matter of simple 
theoretical structure as it is merely the absence of 
any real explicit theory about human behavior. 
The development of a predictive model which 
recognizes the discrepancy between need and 
behavior and tries to systematically take it into 
account represents a second level of sophistica- 
tion. 

At an earlier date perhaps psychoanalysis dealt 
with this problem most effectively in introducing 
concepts such as repression, sublimation,
suppression, defense, reaction formation, etc., to
account for the discrepancy between observed
behavior and the presumed internal drive, need, or
instinctual urge. 

At this level of theorizing some systematic 
variable is added to the internal motivational 
state in order to predict behavior. Perhaps an- 
other way of saying this is that in addition to 
some measure of preference or value of a spe- 
cific goal another systematic concept must be 
introduced, not only in a hit-or-miss fashion but 
perhaps directly into our assessment procedure. 
The psychoanalytic solution has been criticized 
because many specific concepts are introduced 
to account for a discrepancy between drives, 
urges, or needs and observable behavior, but these 
concepts are not readily measurable. In addi- 
tion, one does not know when one explanation, 
e.g., reaction formation, is the explanatory con- 
cept or another such as sublimation or simple 
repression. 

In social learning theory (47) it is presumed
that the relationship between goal preference
(reinforcement value) and behavior can be
determined only by introducing the concept of the
individual’s expectancy, on the basis of past
history, that the given behavior will actually lead
to a satisfying outcome rather than to
punishment, failure, or, more generally, to negative
reinforcement. Since the early formulations of
Tolman (51), expectancy theories have become 
more and more widely relied upon both in human 
learning and personality theories. It is possible 
to conceptualize more specific constructs such as 
repression and reaction formation as only special 
cases of an expectancy for severe punishment and 
that a more general relationship holds which in- 
cludes perhaps all of these and also expectancies 
for punishment or failure of which the indi- 
vidual is quite aware. For example, an individual 
may wish very much to be a good dancer and to 
dance with members of the opposite sex. He 
makes no attempts, however, to dance at a party 
or a dance because he can tell you “but I look 
like a fool when I go out on the dance floor.” 
We need, in other words, to introduce no specific 
construct involving the “unconscious” to explain 
the discrepancy between his wish and his be- 
havior. The S may or may not be aware 
of expectancies which influence his behavior. 
Whether or not he is aware of them may affect 
the degree to which these expectancies change 
with new experience as well as other variables. 


The degree of awareness may be an important 
additional variable; however, the level of ex- 
pectancy itself is the broader variable which 
bears a direct relationship to the potential oc- 
currence of a specified behavior. 

The question arises, then, of how one takes 
into account such factors in an actual testing 
situation. It could be said that no one is really 
so naive as to believe that the strength of an 
internal motivational condition or need is a di- 
rect predictor of behavior. Somehow or other, 
whether or not the individual had learned a given 
behavior or expected it to work is also an im- 
portant aspect of prediction. However, more 
often than not this aspect of prediction has been 
treated as a source of error, something to be 
eliminated if possible, both in testing or in the 
validation of a test instrument. As a matter of 
fact, many currently used instruments attempting 
to assess the strength of motives, drives, or needs 
are usually confounded. Although they may be 
quite sophisticated in methodology, the test items 
or the test variables usually refer in part to what 
the individual did, i.e., overt behavior (“I fre- 
quently lose my temper”), in part to what he 
wished (“I would like to have more friends”), 
and in part to what he expected to be the out- 
come of his own behavior (“I feel that other 
people do not appreciate my good intentions”). 
To some extent these impure items probably add 
to prediction by providing more than one kind 
of referent for behavior, but the nonsystematic 
way in which they are used also limits prediction. 

In trying to predict goal directed behavior from 
tests, two possibilities are open. One of these is 
to attempt to predict behavior from other be- 
havior which presumably is functionally or pre- 
dictively related to the test behavior. What this 
involves is analyzing test situations as behavioral 
samples under a given set of test conditions. For 
example, to assess dependent behavior one could 
use direct observation techniques, perhaps in 
problem solving situations, in which the S is 
scored for help-seeking behavior (41). In ques- 
tionnaires the items should refer to what the S 
does, not to what he expects, wishes, or feels. 
The use of behavior samples for predictions or 
the regarding of all kinds of tests including pro- 
jective tests essentially as samples of behavior to 
be analyzed in terms of what the S does under 
these conditions has been described elsewhere 
(47). Like the work sample test in industry it
undoubtedly provides the best prediction to a 
limited specific behavioral criterion since it re- 
quires the fewest intermediate constructs and the 
fewest assumptions regarding the action of other 
variables.

However, there are many problems, both the- 
oretical and clinical, when it is important to 
break down this behavioral measure into its major 
determinants of reinforcement value and ex- 
pectancy for the occurrence of the reinforcement. 
For example, in psychotherapy an understand- 
ing of how some behavior or group of be- 
haviors may be most readily changed requires 
analysis into at least these two components. Even 
when strictly concerned with predictions of be- 
havior in a broad band of life situations, rather 
than change, analysis into separate determinants 
may provide greater prediction than a work 
sample or behavioral technique. In this second 
alternative it is important either to control or 
systematically vary the other variable or measure 
both. For example, Liverant (30) has measured 
some needs by presenting pairs of items involving 
goal preference matched for social desirability, 
and Jessor and Mandell (34) are developing a 
similar test to measure expectancy for success 
in satisfying the same needs. 

In using projective material such as TAT, 
Crandall (8) has demonstrated that expectancy 
for need satisfaction, for which the term free- 
dom of movement is used in social learning 
theory, can be reliably measured by selecting 
particular kinds of referents. The work of Mus- 
sen and Naylor (39), Kagan (23), and Lesser 
(26) gives strong evidence that the relationship 
between theme counts of aggression on the TAT 
and overt aggressive behavior depends to a large 
extent on whether or not that overt behavior is 
socially acceptable in the Ss’ own homes or social 
climates. The relationship between theme count 
and overt aggressive behavior appears to hold 
only when the Ss do not have a high expectancy 
that aggressive behavior will be punished. Atkin- 
son and Reitman (3) report that in a number 
of studies of need achievement, it has become 
clear that prediction of behavior is enhanced if, 
in addition to taking a measure of need achieve- 
ment based upon achievement theme counts in 
TAT-like material, an additional measure of ex- 
pectancy for success is also taken into account. 


In dealing with this type of testing material 
the recently published study of Fitzgerald (12) 
provides a more systematic analysis. Using a 
highly reliable sociometric technique of nomina- 
tion of fraternity brothers as his behavioral 
criteria and dealing with the need dependency, 
Fitzgerald found no relationship between theme 
counts and overt behavior. Presumably, depend- 
ent behavior is not socially acceptable among 
male college students. He had, however, inde- 
pendent interview ratings of need value, that is 
preference or desire for dependency satisfactions 
and of freedom of movement, or expectancy that 
behavior directed toward achieving dependency 
would be satisfied. 

He found that by using these measures he did 
obtain a significant correlation between theme 
counts and the discrepancy between need value 
and freedom of movement. More specifically, 
what he called a conflict score or score indicating 
the degree to which the individual preferred de- 
pendency or desired dependency satisfactions but 
expected that he could not achieve them
correlated with theme counts for dependency.² On the
other hand, an Incomplete Sentences Blank meas- 
ure of dependency which utilized behavioral re- 
ferents as well as reinforcement value and ex- 
pectancy referents did show a low but significant 
relationship of the number of completions deal- 
ing with dependency with both the sociometric 
and interview measures of need potential or actual 
dependent behavior. Although an actual analysis 
was not made, it seems very likely that the reason 
for the correlation in the case of the ISB and 
not the TAT is that at least some of the ISB com- 
pletions were descriptions of actual behavior. 
Possibly a purer measure of behavior would have 
shown a greater relationship to the sociometric 
and interview rating assessment of actual de- 
pendent behavior in life situations. 

Should we build two instruments or at least
two sets of testing operations to separately assess
need value and freedom of movement, or should
we attempt to use behavioral measures in order to
make our predictions about behavior, we would
still be faced with the problem of predicting in
a specific situation. Given measures of six
behavior potentials, however arrived at, the
problem remains that of knowing which of these is
likely to be the behavior preferred in some
specific situation. One is again forced to predict
that the behavior with the highest potential
always occurs and one is limited again in
prediction to a very low level of accuracy. In the
laboratory situation where we can reduce the
possible alternatives to two, significant, although
not predictive, results are possible. In the life
situation where the alternatives are very
frequently of a large order, the question arises of
whether or not any useful prediction is possible.
This leads us to a third level of sophistication,
one in which the psychological situation is one of
the variables on which prediction is based.


2 Whether or not the relationship between theme
count and high reinforcement value and low
expectancy is general is not yet known. It appears at
this time to possibly depend on whether or not the
test material and testing situation are conducive to the
free expression of fantasy.


THE PSYCHOLOGICAL SITUATION AS A THIRD 
DETERMINANT OF GOAL DIRECTED BEHAVIOR 


Few would deny that the psychological situa- 
tion will affect the potential of occurrence of any 
behavior or class of behaviors. However, the fact 
that behavior will vary from situation to situa- 
tion is most often treated as a source of error, 
something to be avoided. If possible, one should 
construct tests or find personality variables which 
rise somehow above the situation. It is probably 
no exaggeration to say that thousands of hours 
of wasted work have been done by psychologists 
in the vain goal of finding either tests or variables 
which would, somehow or other, predict regard- 
less of the situation in which the test is given or 
regardless of the situation in which the predicted 
behavior is expected to occur. 

There are three separate problems here which 
will be discussed as one basic problem. The first 
problem is to understand the effect of the testing 
situation on test results. For example, Sarason 
(49) has provided an excellent discussion of 
some of the influential situational variables in 
intelligence testing. The second problem is to 
understand the nature of the criterion situation 
which affects the criterion measures. The third 
and ultimately most important problem is to de- 
vise our tests with full consideration of the nature 
of the test situation in order to predict behavior 
in other situations for which the test was con- 
structed. In other words, we need to devise tests 
not to predict personality or needs or behavior in 
the abstract but in specified situations or classes 
of situations if we want high prediction. We 


need to know and take into account the dimen- 
sions of situation similarity in devising test pro- 
cedures. 

Cronbach (9) has criticized the failure to re- 
gard differentially the criterion situations in which 
tests are applied. In regard to the test situation 
we have only attempted to standardize the test 
procedure but usually have ignored the importance 
of the social context in which the test is given. 
Perhaps the most important thesis of this paper 
is that the psychological situation needs to be 
understood and systematically considered in our 
predictive formula, not treated as a source of error 
or something that can be ignored because part of 
the total situation is standardized. 

Recently there have been a number of studies 
which demonstrate that almost all tests are sub- 
ject to faking, to instructional variation, to exam- 
iner influences, to testing conditions, etc., regard- 
less of the type of test (4, 14, 15, 40, 48). The 
general inference drawn from these studies is that 
the tests are poor. Actually the implication of 
such findings is that we are making inefficient use 
of our tests. If the test situation for many per- 
sonality tests is one in which social conformity 
or acceptability is easily achieved and no other 
satisfactions are given up in achieving acceptabil- 
ity, then for some purposes this motive should be 
controlled. However, the test situation can also 
be utilized to measure the importance of social 
conformity for the individual. What we call 
faking is only our recognition of the fact that the 
S is taking the test with a different purpose or goal 
than the one the examiner wants him to have.
For some purposes it might be important to un- 
derstand what kind of goals he exhibits in this 
kind of situation. More often than not we simply 
try to control what we should be studying. For 
example, in giving intelligence tests it might be 
better to study systematically the effect on per- 
formance of encouragement and discouragement 
rather than to attempt some mythical neutral atti- 
tude which is presumably the same for every ex- 
aminer. Knowledge of the effects of situational 
variations would be of particular value in under- 
standing the frequently diverse and contradictory 
results of apparently similar research investiga- 
tions. For example, Henry and Rotter (18) 
found that large, predicted differences were ob- 
tained between two comparable groups on the 
Rorschach test if one group was reminded before 
the regular instructions that the test had been 
used frequently to study psychopathology. An
obvious implication of this study is that investi- 
gators using this same test in the college labora- 
tory and in the clinical setting may well produce 
diverse results. 

Another example of how the situation can be 
used in testing is provided by the patient who is 
being assessed for possible benefit from psycho- 
therapy. If the clinic or hospital can provide 
both male and female therapists and also therapists 
who rely on support and direction as well as thera- 
pists who remain distant and passive, then the 
testing procedures can be varied so that those 
situational influences are present. The testing 
could provide information to indicate what kind 
of therapy and what kind of therapist is likely to 
provide this patient with the most efficient condi- 
tions for relearning. For more conventional pur- 
poses of clinical testing, it is still more important 
to know under what conditions the patient behaves 
in a paranoid fashion and under what conditions 
he does not, than it is to know how many
percentiles of paranoia he has.

Not only can the test situation itself be analyzed 
as a behavioral sample but situational referents 
can be incorporated into the content of items by 
systematic sampling. For example, questionnaire 
items can deal with the conditions under which 
the S feels tense, nervous, happy, has headaches,
etc. as Mandler and Sarason (35) have done for 
some intellectual test taking situations. Similarly, 
projective methods, particularly TAT-type tests, 
can systematically vary the situation through the 
selection of test stimuli as has been done by 
Crandall (8) and McClelland, Atkinson, Clark, 
and Lowell (33). More recently Murstein (38) 
has suggested a conceptual model for stimulus 
variation with thematic techniques. 

The many studies indicating marked effects of 
testing conditions suggest that it is of great impor- 
tance in the publication of any test that descrip- 
tions of the differences in test results that are 
likely to be associated with different kinds of 
testing situations be provided. No test can be 
adequately understood unless the data regarding 
its standardization or use include systematic de- 
scriptions of the differences in test results which 
are a function of different kinds of testing condi- 
tions and different kinds of purposes in taking 
the test for similar samples of Ss. Only when we 
know whether an S is likely to produce different 
test results when he is taking the test to
demonstrate how imaginative he is as compared to
taking it to prove that he needs help will we be
adequately able to understand the meaning of test
results and to predict future behavior from them. 

There have been personality theorists who have 
made much of the importance of the individual's 
life space. Kantor (24) was one of the first to 
emphasize that the basic datum of psychology is 
the interaction of an individual and his meaning- 
ful environment. For Kantor, people do not have 
internal characteristics in the same sense as for 
other theorists; rather they have a reactional bi- 
ography of interactions with the environment. 
Lewin (28) has also emphasized the importance 
of the life space or psychological situation in the 
determination of human behavior. Brunswik (5) 
has repeatedly called for analyses of and sampling 
of psychological situations for predictive purposes. 
Helson (16) has applied his theory of Adaptation
Level to social psychology stating that the effect 
of the total field can be quantified by careful 
ordering of the field of exposed stimuli. Recent 
concern with the importance and need for system- 
atic study of situation variation has been expressed 
by Allport (1) and Cronbach (10).

In a more limited and less systematic way, psy- 
choanalysis has suggested, in a few areas, that 
certain kinds of goal directed behavior depended 
upon the psychological situation. This is done 
in making distinctions between the individual's 
potential response to authority figures vs.
nonauthority figures and males vs. females. Beyond
this, little systematic analysis of differences in life 
situations has been made by the traditional ana- 
lyst. Murray’s (37) formulation of the nature 
of personality stresses that behavior is a function 
of the interaction of an individual with a psycho- 
logical situation which he felt could be categorized 
as “press.” At a more specific level Atkinson and
Raphelson (2) have shown the value of including 
situational variables in studying achievement be- 
havior. This general point of view has also been 
represented in sociology by Thomas (50) and 
Coutu (7). 

In social learning theory, it has been
hypothesized that the situation operates primarily by
providing cues for the S which are related to the
magnitude of his expectancies for reinforcement
for different behaviors. The effect on the value
of the reinforcement itself operates through
expectancies for associated or subsequent
reinforcements which may differ from situation to
situation. It has also been hypothesized that situations
may be usefully categorized in terms of the
predominant reinforcements as culturally determined
dominant reinforcements as culturally determined 
for any large or small culture group. There are 
many other possible ways of categorizing situa- 
tions depending upon the predictive purposes in- 
volved. Methods of determining generality or 
determining the dimensions of similarity among 
situations have been described in an earlier 
paper (48). 
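
The role assigned to the situation can be illustrated with a minimal sketch patterned on formula (1). The behaviors, situations, and all numerical values below are hypothetical, the multiplicative rule is again an assumption (the theory does not fix the form of f), and reinforcement value is held constant across situations for simplicity. Only the expectancies change with the situational cues.

```python
# E[(behavior, situation)]: expectancy that the behavior leads to the
# reinforcement in that situation. All values are hypothetical.
expectancy = {
    ("ask_for_help", "classroom"): 0.8,
    ("ask_for_help", "fraternity"): 0.2,
    ("work_alone", "classroom"): 0.5,
    ("work_alone", "fraternity"): 0.6,
}

# RV: value of the reinforcement each behavior aims at (assumed constant
# across situations in this sketch).
reinforcement_value = {"ask_for_help": 0.7, "work_alone": 0.6}


def behavior_potential(behavior: str, situation: str) -> float:
    """BP = f(E & RV) for one behavior in one situation; f ASSUMED to be a product."""
    return expectancy[(behavior, situation)] * reinforcement_value[behavior]


def predicted_behavior(situation: str) -> str:
    """Return the behavior with the highest potential in the given situation."""
    return max(sorted(reinforcement_value),
               key=lambda b: behavior_potential(b, situation))


for situation in ("classroom", "fraternity"):
    print(situation, "->", predicted_behavior(situation))
```

With identical reinforcement values, the predicted behavior differs between the two situations solely because the expectancies differ, which is the sense in which the situation enters the predictive formula rather than being treated as error.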

Two illustrative studies of an increasing num- 
ber of experimental analyses of behavior which 
vary both internal characteristics and the psycho- 
logical situation systematically in the same study 
are described below. These studies follow the 
basic paradigm that the presumed relevant indi- 
vidual (personality) and situation (experimental 
manipulations) variables can be observed simulta- 
neously and their interaction studied. 


ILLUSTRATIVE STUDIES VARYING BOTH THE 
SITUATION AND INTERNAL CHARACTERISTICS 


A recent doctoral dissertation by James (20) 
illuminates very clearly the potential of greater 
prediction when both the situation and the inter- 
nal characteristic are varied in the same study. 
The behavior being studied by James involved a 
variety of learning variables, including acquisition, 


3 Several writers have pointed out the difficulty of 
identifying situations independently of behavior. That 
is, how can one describe a situation as one would a 
physical stimulus independently of the S’s response? 
The problem is not different from that of describing 
stimuli along dimensions of color although perhaps 
vastly complicated in social or other complex situations.
In the case of color stimuli ultimately the
criterion is a response of the scientist or observer, 
sometimes a response to an intermediate instrument, 
and one that is at the level of sensory discrimination 
and so leads to high observer agreement. In the case 
of the social situation, the level of discrimination is 
common sense based on an understanding of the 
culture rather than the reading of an instrument. As 
such, reliability may be limited but still be sufficiently 
high to considerably increase prediction. In this way 
specific situations could be identified as school situa- 
tions, employment situations, girl friend situations, 
etc. For the purpose of generality various kinds of 
psychological constructs could be devised to arrive at 
classes of situations which have similar meaning to 
the S. The utility of such classes would have to be
empirically determined depending on the S’s response.
The objective referents for these situations, which 
provide the basis for prediction, however, can be 
independent of the specific S. That is, they can be 
reliably identified by cultural, common sense terms. 


changes or shifts, extinction, generalization, and 
recovery of verbalized expectancies for gratifica- 
tion. Two general hypotheses were involved in 
this study growing out of previous work by Lasko 
(25), Phares (44), Neff (42), and James and 
Rotter (27). Hunt and Schroder (19) have also 
dealt with what appears to be a related variable. 
The first of these hypotheses might be stated as a 
situational one. That is, that the nature of a 
learning process differs depending upon whether 
or not the situation is one in which the reinforce- 
ments that occur are a direct outcome of some 
internal characteristic of the individual such as 
skill, a physical characteristic, or whatever, versus
a situation where the reinforcements are essentially
controlled by someone else or by chance
or by conditions or powers beyond the S’s control. 
Perhaps a good example of the latter would be a 
dice game or the winning of a door prize or 
having soup spilled on one because a waiter 
tripped, etc. James utilized line and angle match- 
ing tasks reinforcing each S positively on his 
guesses in six of the eight training trials. He specifically
hypothesized that when the situation is structured
in such a way that the S expects the occurrence
of reinforcements to be beyond his control
or partly beyond his control, increments and decrements
in expectancy for gratification as a result
of experience are smaller, the number of unusual
shifts, that is, shifts up after failure or down after
success, is greater, extinction is faster, and there
is less generalization from one task to another 
and greater recovery following extinction. 

The measurement of individual differences in 
this study followed from the previous work of 
Phares which suggested that individuals can be 
differentiated in the degree to which they see the 
world and the things that happen to them as con- 
trolled by others or as determined by chance or 
unpredictable forces. The second hypothesis, 
then, was that all the differences which would 
occur as a result of the situational conditions 
would also be true of individuals within all groups 
as a function of their general attitude towards 
“control of reinforcement.” 

In order to predict the individual differences in 
attitude, James enlarged and revised the question- 
naire first devised by Phares. This was given to 
all Ss at the end of each experiment. The results 
are most striking. All of the predicted outcomes 
hypothesized above resulting from the differences 
in instructions or situations were obtained and all
were statistically significant with the exception of
recovery following extinction, which showed a 
strong trend in the predicted direction. Similarly, 
within each group the individuals high as com- 
pared to low on the questionnaire differed signifi- 
cantly in exactly the same way as did the groups 
themselves as a result of the different instructions 
or situations presented. Although individual pre- 
diction was not the concern of this investigation, 
it is quite clear that a simple formula could be 
devised which could predict all of the learning 
variables involved in this study with a fair degree 
of accuracy. Certainly it is clear that a greater 
degree of accuracy is possible when both the situ- 
ational and individual variables are taken into 
account. Perhaps far more important, this study 
indicates that various experimental paradigms in 
studying human learning are likely to produce 
different kinds of results. Whether or not a given 
learning task is one in which the S feels that suc- 
cess is dependent upon the experimenter’s manipu- 
lation (for example, when he is expecting to pre- 
dict a random sequence of red or green lights) 
or is the result of his own skill provides a crucial 
difference in the nature of the learning process 
itself.

The study of James, however, does not provide 
a satisfied feeling that it illustrates all of the prob- 
lems of prediction involving both the individual's 
characteristics and the situation from which the 
prediction is made. It gives an almost too simple 
picture of the interaction of these two variables. 
Another recent dissertation by Moss (36) suggests 
that this relationship can be more complex, and 
illustrates more clearly the effect of the testing 
situation on more commonly used types of assess- 
ment procedures. 

Moss studied a general behavioral characteristic 
which he called cautiousness. Essentially this was 
defined as the avoidance of risk, the selection of 
the safest alternative in a situation where failure 
or negative reinforcement was possible. He var- 
ied the situation by reacting differently to three 
groups following the administration of a question- 
naire which he described as a measure of social 
acceptability. One group was shown false norms 
at the conclusion of the questionnaire that indi- 
cated that they were in the ninetieth percentile of 
social acceptability for a college group. Another 
group of Ss was shown that they were at the tenth 
percentile, and a third group was given no infor- 
mation about the results of this supposed test of 


social acceptability. He hypothesized that cau- 
tious behavior would increase with negative rein- 
forcement. Immediately following this procedure, 
the Ss were given two projective type tests and 
behavior on these tests was analyzed as to degree 
of cautiousness. 

Prior to the giving of the “social acceptability” 
questionnaire the Ss had been tested on the level 
of aspiration board. Behavior on the level of 
aspiration board (46) was categorized into cau- 
tious or noncautious patterns. The general tend- 
ency to seek safe alternatives in the obtaining of 
satisfactions then was measured in a situation in 
which the S himself has some control over failure 
or success. 

One kind of behavior studied was that of sort- 
ing figures taken from the MAPS test. The S was 
presented the figures and asked to sort them into 
two piles any way he wished. The procedure was 
repeated a second time asking for a different kind 
of sort, and a third time. The sorts themselves 
were characterized as being safe or cautious in 
that they dealt with highly objective characteristics 
of these figures, or less safe in that they dealt with 
characteristics which were more abstract or had 
to be read into many of the figures. For example, 
sorts based on personality characteristics were 
considered as noncautious as opposed to safer or 
more cautious sorts such as those into groups of 
men and women, children and adults, Negroes 
and whites, etc. 

A second kind of behavior studied was the S’s 
response to a series of four TAT pictures. In 
this case the S’s stories were treated as Weisskopf 
(52) has treated them with her measure of tran- 
scendence. A cautious or safe interpretation was 
one sticking close to the characteristics of the 
picture and one in which the theme itself was a 
common one. 

Moss found some differences among his groups 
in the direction he had hypothesized, that is, that 
the threatened group, the group that was told 
that it was at the tenth percentile, showed greater 


4 Cautious and noncautious behavior was character- 
ized according to the patterns described by Rotter 
(47, pp. 318-324). Patterns 1 and 3 were considered 
as noncautious and 2, 4, 7, and 8 as cautious patterns. 
The latter group are characterized by a variety of 
techniques presumably aimed at avoiding failure to 
reach explicit goals. Patterns 1 and 3 are character- 
ized by higher expectancies than performance but 
within “normal” bounds and consequently a higher 
number of failures to reach one’s estimates.


cautiousness than the other two groups. The
differences between groups, however, although 
consistently in the direction he predicted, only 
approached significance and were not large.
However, when Moss divided his Ss within groups 
into cautious and noncautious on the basis of 
their level of aspiration patterns, he found some 
highly significant differences. The cautious Ss 
showed no significant differences among the three 
conditions. However, the noncautious Ss showed 
significant differences between conditions. That 
is, noncautious Ss in one condition responded dif- 
ferentially from noncautious Ss in another condi- 
tion. These differences were primarily due to 
greater noncautious behavior in the no-informa- 
tion group. Differences in test behavior between 
cautious and noncautious Ss were also highly sig- 
nificant on both tests within the no-information 
group but not in the other groups. 

In spite of the complexity of this study, a few 
findings seem relatively clear from an analysis of 
group means as well as significance of differences. 
Ss who were cautious on the level of aspiration 
test, which is a somewhat free situation, were also 
cautious in the other test conditions. Of course, 
this does not mean that they were cautious in 
situations which were not perceived by them as 
evaluation situations. On the other hand, Ss who 
were noncautious on the level of aspiration test 
appeared to maintain this greater risk taking be- 
havior under test situations where no information 
about results was given. However, when they 
were negatively reinforced, they became more 
cautious and they also did not appear to be dif- 
ferent from cautious Ss under conditions where 
they were quite successful. Perhaps this is related 
to the presumed conservatism which follows from 
success. There was no consistent prediction from 
the level of aspiration situation to the two “pro- 
jective tests” which could be made without con- 
sidering the situation. In at least two of the 
situations the cautious Ss were not significantly 
different from the noncautious Ss. On the other 
hand, the psychological situation or the three dif- 
ferent situations seemed to have no effect on the 
cautious Ss. Only in the interaction of the non- 
cautious Ss with the situational variables was pre- 
diction possible from the level of aspiration test. 

A similar type of result to that of Moss was 
recently reported by Lesser (27). Lesser found 
that intercorrelations among various measures of
aggression were significantly higher under experimental
conditions of low anxiety than under conditions
of high anxiety about aggression.

James’ results suggest a rather simple relation- 
ship between dispositional and situational varia- 
bles, but it is clear from the study of Moss and 
other studies that a simple additive or multiplica- 
tive relationship will not always describe the na- 
ture of the interaction. An important implication 
of this principle is that there is a general lack of 
efficiency in research studies in which only one 
set of variables, that is, only dispositional or situational,
is systematically varied, since the conclusions
of the two sets of studies cannot be put
together in a simple fashion. Unless both kinds 
of variables are systematically varied within the 
same investigation, both an understanding of the 
determinants of behavior and the prediction of it 
may suffer. 

A striking example of the importance of study- 
ing the effects of dispositional and situational in- 
fluences simultaneously is provided by Helson, 
Blake, Mouton, and Olmstead (17). In applying 
Adaptation Level theory to a study of attitude 
change they found important interactional effects 
when situations were varied in external influence 
pressure and individuals were distributed on a 
measure of ascendancy-submissiveness. 

It is true that many of the above propositions 
are obvious. Most psychologists recognize that 
there is a difference between overt behavior di- 
rected toward certain goals and the desires that 
individuals have to obtain these goals. Similarly, 
most psychologists know that the psychological 
situation is a determinant of the occurrence of a 
given behavior. The thesis here, however, is 
not merely that this is the case but that all of 
these variables must be ordered and studied sys- 
tematically, in order to make predictions. 


SUMMARY 


The major contention of this paper has been 
that the prediction of goal directed behavior of 
human subjects from test procedures has been 
and will continue to be at an extremely low hit- 
or-miss level because of inadequate conceptual- 
ization of the problem. Findings are frequently 
not replicatable because of the failure to systemati- 
cally differentiate behavior, reinforcement value, 
and expectancy as internal variables and to recognize
that these variables are affected by the psychological
situation.

The psychological situation of the patient in 
the clinic is so different from that of the elemen- 
tary psychology student taking a test as part of 
an experiment that it is possible that the kinds of 
predictions which can be made in one situation 
would hardly hold in the other. The evidence 
that faking is possible and that different norms 
obtain when subjects are job applicants, employ- 
ees, or volunteers does not necessarily mean that 
a test is no good. Nor is prediction essentially 
hopeless because it can be demonstrated that two 
experimenters, whether the same sex or opposite, 
or slight changes in the wording of instructions, 
will differentially affect test or experimental results.
All of these things indicate only that the
psychological situation, perhaps acting primarily
through the expectancies it arouses by the cues
present, considerably affects behavior. It is necessary
that we do not consider such influence as
error to be ignored, as difficulty to be avoided 
or as the problem of some other profession to 
investigate. Rather it is necessary to study these 
influences and consider them regularly and sys- 
tematically in a predictive schema. That is, for 
some purposes, factors such as social desirability 
of items, examiner’s behavior, and the subject’s 
goals in the test situation should be controlled, 
and in other cases they should be allowed to vary. 
In all cases, however, they must be systematically 
considered.

Implicit in this entire paper is the belief that 
a satisfactory theory of goal directed behavior is 
a primary prerequisite for developing adequate 
tests. Knowledge of statistics and test construc- 
tion procedures can be valuable, but they cannot 
supplant an adequate theory of behavior which 
is applied to the test taking behavior itself. 

To arrive at a fully systematic model for re- 
lating these general or high order constructs and 
to coordinate them in turn to lower level sets of 
content variables, devised for different purposes, 
will be a long and arduous but rewarding task.


SECTION IV


Personality and Development


If personality characteristics play a role in de- 
termining behavior, then, it would seem most 
relevant and interesting to conjecture about and 
empirically study the development of these char- 
acteristics. Four aspects of personality develop- 
ment are explored in the articles in this section. 

The two papers by Barker and Wright and by 
Sears both are concerned with the personality de- 
velopment of the child, but these investigators 
attack the problem with different, complementary 
methodologies. Barker and Wright are inter- 
ested in developing field observational techniques 
which can be employed in describing the child’s 
ongoing response patterns in his natural environ- 
ment. The potentiality of this descriptive ap- 
proach to development has aroused much interest 
on the part of child psychologists, and, particu- 
larly, those concerned with personality develop- 
ment. 

The Sears article reflects a growing interest 
within the field of child psychology in the experi- 
mental study of children’s behavior. Although 
possibly lacking in some of the flavor of the “real 
life” situation, the experimental approach pro- 
vides for the possibility of careful manipulation 
of particular variables of interest to the experi- 
menter. 

In view of the fact that the behavior of children 
has only relatively recently been subjected to sci- 
entific study, it is not surprising that its interpreta- 
tion often rests more on the preconceptions and 
biases of the interpreter than on factual evidence. 
Stevenson presents an interesting contribution 
relevant to the assumption of the great flexibility 
of the infant and child. His discussion of this 
assumption, which he presents in the light of available
information, provides a sound argument in
support of the use of an empirical approach to 
development from infancy through adulthood. 

One way in which this approach might be im- 
plemented would be to systematically observe and 
study groups of individuals from infancy through 
adulthood or for certain lengthy time periods. 
The many practical problems associated with such 
an effort have limited the number of longitudinal
studies carried out by psychologists.

Kelly’s longitudinal research is doubly valuable
because not only does it represent a contribution 
to the difficult longitudinal study of human be- 
havior but it does so for a period of life, adult- 
hood, concerning which little developmental 
knowledge is available. In his article Kelly pre- 
sents provocative material concerning the ques- 
tion of the consistency with which the adult per- 
sonality manifests itself over time and the extent 
to which personality patterns become modified 
during adulthood. 


PSYCHOLOGICAL ECOLOGY 
AND THE PROBLEM OF 
PSYCHOSOCIAL DEVELOPMENT * 


ROGER G. BARKER AND 
HERBERT F. WRIGHT 1


Psychosocial development occurs as the person
behaves in life situations. One method of
studying the person-situation-behavior variables
that are relevant to psychosocial development is
the straightforward one of describing the behavior 
of persons in naturally occurring situations. 
Nonexperimental investigation of this kind has 
been productive in astronomy and the earth and 
biological sciences. The literatures of these sci- 
ences record many practical and theoretical 
achievements that are based upon naturalistic de- 


* Reprinted by permission from Child Develop- 
ment, September, 1949, Vol. 20, No. 3, 131-143. 

1 This paper is an expansion of one given at the 
Denver symposium by one of the authors. The paper 
has been prepared by the authors named, but it repre- 
sents the thinking of the whole staff of the Midwest 
Child Study Project. The staff includes the authors 
and Phil Schoggen, Jack Nall, Louise Barker, Lorene 
Wright, Beverly Fox, Lucille Johnson, Maxine Schog- 
gen, Irene Nall and Mariana Remple. This project 
is being supported by a research grant from the Divi- 
sion of Research Grants and Fellowships of the Na- 
tional Institute of Health, U. S. Public Health Service, 
and by the University of Kansas. We are greatly in- 
debted to the people of Midwest for their understand- 
ing and help. 


scription. In the literature of biology, for ex- 
ample, there are many detailed, concrete descrip- 
tions of the conditions of life and the structural 
and functional adaptations thereto of plants and 
animals. Descriptive ecology is an important 
part of biology and an important source of both 
practical information and scientific knowledge. 

Psychology early became experimental; it has 
accomplished little toward the development of 
field-study methods. There are few scientific 
records that tell of a human mother caring for 
her young. There are few descriptions that give 
an account of how a particular teacher behaved 
in a classroom and of how the children reacted. 
We do not know with scientific accuracy what the 
members of some one family actually did and 
said during a meal. We cannot find a detailed 
history of any child from the time he awoke one 
morning until he went to sleep that night. Ecol- 
ogy is scarcely a recognized branch of psychology. 

The near absence of ecological techniques and 
data in psychology precludes the investigation of 
certain problems having an important bearing 
upon psychosocial development, and because of 
this deficiency the study of other problems is 
greatly retarded. It is important, therefore, that 
the meaning of ecology and its place in psychol- 
ogy be made explicit. 


PSYCHOLOGICAL ECOLOGY


The Greek root of the word ecology means
home or homeland. For biologists, the study of
ecology is concerned with the naturally occurring
biological homelands or habitats of plants and
animals—particularly with the relations between
habitat and function, population characteristics,
and structural change.


If we take the terms ecology and habitat into
psychology and follow biological usage, the first
question is this: What is the homeland, the habitat 
of behavior? It is right here that a disagreement 
has already arisen in the use of these terms. 
One homeland of behavior is the world as it 
exists for the person and as it affects behavior, 
the psychological situation or life-space. The 
other homeland of behavior is the impersonal 
world which does not affect behavior directly, yet 
both limits the psychological world and contributes
to its content. This other homeland of behavior
is the world of physical and geographical
conditions, of social groupings and interactions,
of economic and political processes, of institutions,
of prevailing ideological patterns; it is the
non-psychological milieu which surrounds the
psychological world and from which in part psychological
habitats are made.

Lewin (3) has used the term psychological 
ecology to refer to the relations between these 
two realms, the psychological and the non-psycho- 
logical worlds. In a study of food habits, for 
example, he points out how the psychological de- 
terminers of this behavior are directly influenced 
by economic, agricultural, geographic and other 
non-psychological conditions. The problem of 
establishing the connections between non-psycho- 
logical facts and psychological situations or habi- 
tats Lewin calls the ecological problem. 

Brunswick (1) uses the term ecology to denote 
the body of problems concerned with the de- 
scription of the psychological world and its re- 
lation to behavior. Ecological variables for 
Brunswick are the perceived size and distance of 
objects, not their measured size and distance; or 
the judged intelligence and political conservatism 
of persons, not their measured intelligence and 
conservatism. 

Both Lewin and Brunswick agree in using 
ecology and habitat to refer to naturally occur- 
ring, non-experimental situations. There are 
good reasons for continuing this usage and for 
giving the term ecology in psychology a very 
general referent, namely: the study of behavior 
in naturally occurring situations. This marks off
psychological ecology from experimental psychology,
with the latter defined as the study of behavior
in artificially arranged situations, whether
these include brass instruments or [...] like
projective tests and questionnaires.

Following Brunswick and Lewin further, we 
can distinguish two ecological problems and two 
corresponding terms to identify them. 

First, there is the problem of the non-psycho- 
logical milieu, of the material-cultural world in 
which the person is immersed, and of the way it 
is transformed into a psychological world or
habitat. 

There is the further problem of the psycho- 
logical habitat, of the naturally occurring psycho- 
logical world, one that depends partly upon the 
non-psychological milieu and partly upon the 
motives and abilities of the person. The require- 


ment here is to describe existing habitats and to 
show how they are related to behavior. 

Both of these problems are of fundamental 
significance for psychosocial development. 


PSYCHOLOGICAL HABITAT 


We have good techniques for describing and 
measuring behavior, but inferior means of de- 
scribing and measuring psychological habitat. 
One of the first needs here has been the need 
for more adequate concepts. In the absence of 
such concepts we resort to short-hand ways of 
denoting different habitats, generally in non- 
psychological terms, as when we point to the 
“situation” of the “only child,” the “lower class 
child,” the “Negro child,” the “urban child,” the 
“institutionalized child.” It is true that biologists, 
too, use non-biological terms to designate animal 
and plant habitats, as when they refer to the 
Alpine, Hudson, and Sonoran zones. Biologists,
however, know a great deal about the biological 
complexes, the manifolds of temperature, pre- 
cipitation, soil conditions, and the like, which 
these terms designate. But psychologists often 
have only the vaguest understanding of the psy- 
chological conditions named by their habitat 
terms. 

When the needed information becomes avail- 
able and we have psychological habitat maps, as 
we now have biological habitat maps, a number 
of approaches to the study of psychosocial de- 
velopment will become easier. Important possi- 
bilities here include the following. 

1) Comparative studies of habitat and be- 
havior at different times and places. One cannot 
investigate the changes which have occurred dur- 
ing the last century in the rearing and behavior of 
children. No scientific records have been kept. 
The best one can do is to call upon the writings 
of novelists, diarists, letter writers and news re- 
porters, and from them reconstruct in some de- 
gree the conditions of the past. This is almost as 
true for intercultural and subcultural research.²
There are few descriptions of the conditions of 
life and behavior of particular individuals in dif- 
ferent cultures which are suitable for psychologi- 
cal analysis. It is virtually impossible, moreover, 

2 The work of sociologists and ethnologists deals
largely with the generalized habitats and behavior of 
the people of a culture or subculture. We are con- 
cerned, here, with the habitats and behavior of par- 
ticular individuals. 




for an investigator to cover the ground of avail- 
able studies from the original data collection 
through the stages of analysis and interpretation. 

2) Studies of acculturation and personality formation.
Acculturation occurs and personality is
formed through the interaction of psychological 
habitat and behavior as they occur in life situa- 
tions. The tasks of the student of acculturation 
and personality are to formulate the general laws 
by which psychological habitat is related to accul- 
turation and personality formation, and also to 
discover and describe existing psychological habi- 
tats. The whole problem is one that cannot be 
met in laboratories or clinics; it requires first-hand 
descriptions of behavior and conditions as they 
occur in real life. At the present time we have 
little scientific information on this matter. We 
know much about what children do in the labora- 
tory, under experimentally induced frustration and 
conflict, for example; and we know much about 
what children do in the situations created by 
psychological tests. But we know little about the 
content, order, and patterning of the situations 
which actually exist for children in their daily 
lives; nor do we know how children react to these 
situations. We know a great deal about how chil- 
dren are able to behave, but we know little about 
how they do behave because we have not studied 
what they do in their psychological habitats. 

3) Theoretical studies. The lack of naturalistic 
records of behavior has retarded systematic, theo- 
retical studies. It is well known that important 
theoretical achievements have been made in such 
non-experimental sciences as astronomy, geology, 
and meteorology. It would be unfortunate if psy- 
chology should neglect at this stage in its devel- 
opment the naturalistic approach to problems of 
explanation, for psychology is at present much 
better equipped than formerly to handle natural- 
istic data on an adequate conceptual level. The 
central task of theoretical psychology is to formu- 
late the general laws governing the simultaneous 
and sequential interdependencies of behavior and 
situation. Methods of recording the simultaneous 
and successive behavior manifolds in naturally oc- 
curring situations are essential to progress in this 
difficult task. For one thing, it is impossible to 
create experimentally some conditions that occur 
in life situations. No laboratory can duplicate 
the frequency of repetition, duration, intensity or 
complexity of psychological situations and behavior
that are common outside the laboratory.
Further, a great variety of psychological habitats
occur in every community, so that it is possible 
to test hypotheses by comparing the behavior of 
persons in different habitats. The proposed pro- 
cedures are similar to well established public health 
methods in which health and growth are studied 
in relation to naturally occurring nutritional and 
sanitary conditions. 


NON-PSYCHOLOGICAL MILIEU 


We turn to the other main problem of psycho- 
logical ecology, the relation between psychologi- 
cal habitat and the non-psychological milieu. How 
and in what degree does the material-cultural 
world affect psychological habitat? 

Through studies in Midwest, U.S.A., we have 
found that this community, like every other, pro- 
vides for its children a limited number of psycho- 
logical habitats, just as a geographical area pro- 
vides a limited number of plant and animal habi- 
tats. How does this come about? In a consider- 
able degree through manipulation of the non- 
psychological milieu. It is common to say that 
the world of the child, his habitat, arises through 
learning. Our studies already suggest, however, 
that habitats are created or destroyed in a much 
more direct and immediate way, i.e., through 
manipulation of the non-psychological milieu. 
For example, the children of Midwest have few 
regularly recurring habitats which mean to them, 
“this is a time and place to have a good time with 
grown-ups’ help.” There are few adult-created 
recreational groups—no hobby clubs, no dancing 
classes, no purely social organizations for children 
or adolescents. Adult-sponsored children’s activ-
ities are chiefly cultural and educational, with sheer
recreation as an adjunct, e.g., the programs of
the 4-H Club, the Sunday schools, and the Scouts. 
The scarcity of purely social-recreational habitats 
for children in Midwest is not a consequence of 
learning by the children that they are “bad” and 
therefore to be avoided; this is, in fact, not true. 
Their scarcity is an outcome of the fact that the 
community does not provide the milieu so that 
many regularly recurring habitats of this kind can 
exist for its children. The town uses its re- 
sources for the psychosocial development of chil- 
dren more in other ways. 


METHODS OF PSYCHOLOGICAL ECOLOGY 


In the space available, it is possible to consider 
briefly only two methodological problems of psy-
chological ecology, both of them concerned with
behavior and psychological habitat. We refer to 
(1) methods of sampling, and (2) methods of re- 
cording.? 

Sampling Behavior and Psychological Habitat. 
— The primary data of a study in psychological 
ecology are behavior-habitat units. One of the 
first findings of the Midwest research was that 
great numbers of such units occur in even a small 
community. Midwest has a total population of 
700, of which 109 are children below the age of 
12 years. Our records indicate that, considering 
only the children, there occur in Midwest 75,000 
to 100,000 differentiable behavior-habitat units, 
or episodes, each day. It is clear that means have 
to be found for sampling so great a mass of ma- 
terial. 
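
The scale of the sampling problem can be checked with simple arithmetic. The sketch below uses only the figures quoted above (109 children, 75,000 to 100,000 episodes per day); the one-percent sampling fraction is a hypothetical value chosen for illustration and is not a figure from the Midwest research.

```python
# Figures quoted in the text: 109 children under 12, and an estimated
# 75,000 to 100,000 behavior-habitat units (episodes) per day.
CHILDREN = 109
DAILY_UNITS_LOW = 75_000
DAILY_UNITS_HIGH = 100_000


def units_per_child(total_units: int, children: int = CHILDREN) -> float:
    """Average number of episodes per child per day."""
    return total_units / children


def expected_sample_size(total_units: int, fraction: float) -> int:
    """Episodes retained if a given fraction of one day's units is sampled.

    The fraction is an assumption made for illustration only.
    """
    return round(total_units * fraction)


low = units_per_child(DAILY_UNITS_LOW)
high = units_per_child(DAILY_UNITS_HIGH)
print(f"{low:.0f} to {high:.0f} episodes per child per day")
print(expected_sample_size(DAILY_UNITS_LOW, 0.01), "episodes in a 1% sample")
```

On these figures each child contributes, on the average, several hundred episodes a day, so that even a small sampling fraction leaves a substantial body of material.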

Two kinds of sampling guides are available. 
First, there are stable, non-psychological charac- 
teristics of individuals that are known to be cor-
related with behavior. These include sex, age
and social class. Factors associated with varia- 
tions in these characteristics influence behavior in 
more or less characteristic ways. Secondly, there 
are stable, non-psychological situations of the com- 
munity which are known to be correlated with be- 
havior, e.g., the situations provided by the Sun- 
day school, the day school, a basketball game, the 
drug store. Such community situations coerce the 
children who enter them to behave in relatively 
homogeneous ways regardless of the individual 
characteristics of the children. The behavior of 
children in Sunday school will be quite different 
from the behavior of the same children in day 
school, in the drug store, or at the basketball 
game. Every community provides a number of 
such situations; we may call these the behavior 
settings of the community. 

The student of behavioral ecology is in a posi- 
tion to make a two-way stratified sampling of 
behavior-habitat units if he has available a census 
of the individuals of the community with their 
age, sex, and social class, and a map of the be- 
havior settings of the community. It is unneces- 
sary to consider further here the variables of age, 


8 Other problems of method that cannot be con- 
sidered here are 1) methods of identifying and de- 
scribing the non-psychological milieu, 2) methods of 
defining and identifying valid units of behavior and 
habitat, 3) methods of describing behavior and habi- 
tat in valid conceptual terms. These problems will 
be considered in subsequent papers. 


sex and social class; however, behavior settings 
require additional discussion. 
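
A two-way stratified sampling of this kind may be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the Midwest research: the record layout (the keys `age_group`, `sex`, `social_class`, `setting`), the function name, and the rule of drawing a fixed number of episodes from every cell are all hypothetical.

```python
import random
from collections import defaultdict


def two_way_stratified_sample(units, per_cell, seed=0):
    """Draw up to `per_cell` episodes from each (person stratum, setting) cell.

    `units` is a list of dicts with the hypothetical keys
    'age_group', 'sex', 'social_class' and 'setting'.
    """
    rng = random.Random(seed)  # fixed seed so that the draw can be repeated
    cells = defaultdict(list)
    for unit in units:
        # First stratification: stable characteristics of the individual.
        stratum = (unit["age_group"], unit["sex"], unit["social_class"])
        # Second stratification: the behavior setting of the community.
        cells[(stratum, unit["setting"])].append(unit)

    sample = []
    for key in sorted(cells):  # deterministic order over the cells
        members = cells[key]
        sample.extend(rng.sample(members, min(per_cell, len(members))))
    return sample


# Hypothetical episode list: 2 sexes x 2 settings, 5 episodes in each cell.
episodes = [
    {"id": f"{sex}-{setting}-{n}", "age_group": "6-11", "sex": sex,
     "social_class": "II", "setting": setting}
    for sex in ("F", "M")
    for setting in ("drug store", "Sunday school")
    for n in range(5)
]

chosen = two_way_stratified_sample(episodes, per_cell=2)
print(len(chosen))  # 4 cells x 2 episodes each
```

Drawing the same number of episodes from every cell keeps rarely entered settings from being swamped by common ones; a proportional allocation would be an equally defensible choice.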

The behavior settings of a community form a 
link between the non-psychological milieu of a 
child and his psychological habitat. Every be- 
havior setting contains non-psychological forces 
which compel the child to develop a psychologi- 
cal habitat that is in some degree appropriate to 
or isomorphic with it. A behavior setting may be
defined as a physical or social part of the non- 
psychological world that is generally perceived as 
appropriate for particular kinds of behavior. The 
behavior settings of a community are the limited 
number of physical-social regions which are se- 
lected from the infinite number that could be dis- 
criminated and which are perceived as possessing 
appropriateness as centers for particular kinds of 
behavior. 

A behavior setting is denoted both by its ob-
jective physical-social characteristics and by its
perceived behavioral possibilities. Thus, the be- 
havior setting, “chair,” has both certain physical 
properties (including shape and size) and per- 
ceived behavior appropriateness (a place to sit). 
A person may in this case engage in the appro- 
priate behavior in the absence of a chair. He 
may sit on a stone; but this does not make the 
stone a “chair” for the person, if his perceptions 
are normal. An object with the physical proper- 
ties of a chair which, as in some oriental cultures, 
is not perceived as being appropriate for the be- 
havior of sitting is not a “chair.” 

Similarly, a social behavior setting requires both 
a certain social arrangement and a particular per- 
ception of its behavior appropriateness. Thus, 
the behavior setting, “birthday party,” must in- 
clude both an objective social arrangement (invi- 
tations, guests, refreshments) and characteristics 
of the situation which are perceived as appropri- 
ate for “party” behavior (gifts, congratulations, 
games). Either, alone, does not make a birthday 
party.

The behavior settings of a community are partly 
created by the people, as in the instances of 
streets, churches, and policemen. In part, too, 
they are selected from the infinite number of 
conditions and states of affairs, such as bathing 
beaches, quarries, and social classes, that make 
up nature and society. The behavior setting has 
the same place in the psychology of action as a 
color chart in the psychology of perception. To 
be “red” on the color chart an area must have a
certain physical wave length of reflected light plus
the perception of it by most people as “red.” If 
either of these dimensions of the area should 
change, the color would not be “red.” 

Some physically discriminable parts of the en- 
vironment are not behavior settings because they 
are not generally perceived as being appropriate 
for behavior of any kind. Thus, an abandoned 
railroad grade near Midwest is not a behavior set- 
ting. It is not seen as appropriate for any activity: 
not for gardening or farming (poor soil and to- 
pography); not for sliding (banks too short); not 
for hiking (it goes nowhere); not for picnicking 
(no shade). 


Again, the behavior setting stands midway be- 
tween the non-psychological milieu and the psy- 
chological habitat. It is a part of the non-psycho- 
logical world that is generally perceived as having 
some particular behavior appropriateness. The 
behavior setting is not a part of the non-psycho- 
logical milieu as such, nor a part of the milieu 
which has been transformed into psychological 
habitat for a particular individual; it is a part of 
the milieu as it is generally perceived.* 

In practice, behavior settings can be discovered
by listing the distinguishable physical and social


+ Sociologists and ethnologists are concerned with 
this problem on a culture-wide basis. 


EXAMPLES OF CHILD BEHAVIOR SETTINGS IN MIDWEST


Physical-Social Milieu / Behavior Appropriateness for Grade-School Child


Milieu: Own home, 6:00-8:00 a.m.; parents call, child undressed, clothes available, bathroom facilities available.
Behavior: “Getting up and getting dressed.”

Milieu: Own home, indoors; no specific coercions operating.
Behavior: Wide variety of quiet, non-destructive “proper” activities (playing, reading, listening to radio, etc.).

Milieu: School classroom, “study period.”
Behavior: Sitting, quiet shifting of position, writing, reading to self, thinking, day dreaming, quiet talking with neighbors about lessons, walking about with permission.

Milieu: School halls and cloakrooms, before school and between classes.
Behavior: Restrained talking, joking; walking about freely; eating candy.

Milieu: Drug store.
Behavior: Purchasing and eating candy, soft drinks, ice cream; purchasing and/or reading comic magazines; meeting and conversing with friends; mild hilarity.

Milieu: Sunday school, opening exercises.
Behavior: Sitting still, listening, group singing.

Milieu: Women’s Club.
Behavior: Helping mother serve, quiet sitting and listening, performing (speaking piece, singing).

Milieu: Streets (except those around courthouse square).
Behavior: Walking, running, riding horse or bicycle, playing ball.

Milieu: Hardware store.
Behavior: Examining and buying goods for sale, securing information about the operation of tools and machines, watching mechanics, playing in small groups among refrigerators, etc.

Milieu: Movie.
Behavior: Eating popcorn, meeting friends, watching show.

Milieu: Tavern.
Behavior: Looking in.


features of a community, and by asking the in-
habitants, “What is the appropriate thing to do
here?” This question may be asked directly of 
informants or it may be asked and answered im- 
plicitly by observing the characteristic behavior 
of people in different physical and social regions 
of the community. 

Examples of behavior settings for the children
of Midwest are given in the accompanying table. They have been
identified upon the basis of information secured 
from informants, reports in the Midwest Weekly, 
and field observations. The breadth of the cate- 
gories has been determined by the requirements 
of our research; for some problems, considerably 
more detail would be required. The psychosocial 
development of the children of Midwest takes 
place within such settings as these. The complete 
inventory of behavior settings serves as a guide 
for sampling the psychological habitat and be- 
havior that occur in the community. 

Recording Psychological Habitat and Behavior. 
—Scientific descriptions by psychologists of human 
behavior in naturally occurring situations have 
used four general methods, which are represented 
by 1) specimen records, sometimes called “diary 
records,” “narrative accounts” or “anecdotal rec- 
ords”; 2) time sample records; 3) surveys and 
tabulations; 4) case studies and biographies. Each 
method has advantages for ecological studies. It 
is not possible to consider all of the methods here. 
We should like, though, to indicate the place of 
specimen records in ecological research. 

In their original form, specimen records were 
simple narrative descriptions of behavior. Exam- 
ples are given below: 


Preyer (4): On the 26th day... He 
started suddenly when a dish that he could not 
see was noisily covered near him. He is 
frightened, then, already at unexpected loud
noises, as adults are. On the thirtieth day this
fright was still more strongly manifested. I
was standing before the child as he lay quiet,
and being called, I said aloud, ‘Ja.’ Directly
the child threw both arms high up quickly and 
made a convulsive start with the upper part of 
his body, while at the same time his expression, 
which had been one of contentment, became 
very serious. 

Isaacs (2): Phineas (3:10) went to look at
his garden and found the mauve crocus which 
he had planted last week; he showed it to Miss 
C. saying, “It’s come out—that’s because
I made it some pudding.” He wanted to plant
more bulbs and went to the rubbish heap to 
look for more. There were two crocus open 


and he began to dig them up. The stems broke 
and they came up without the bulbs. Phineas 
said, “Are they broken?” and Lena told him, 
“Yes because there aren't any bulbs.” ... 
Later Phineas “dug” one up complete with bulb 
and shouted, “There’s the root . . . That one 
will grow!” He planted it in his own plot. 


Records of this kind have been severely criti- 
cized, and it is partly because of them that the 
term “anecdotal” has become one of opprobrium 
in psychology. The criticisms which have been 
mentioned most often are 1) biased selection of 
incidents, 2) unreliable reporting, 3) unwarranted 
interpretation, and 4) difficult recording and anal- 
ysis. These criticisms are undoubtedly justified 
in many cases in which primitive specimen rec- 
ords have been used. However, the reading of 
such anecdotes, even in their early inadequate 
form, reveals advantages which have not often 
been pointed out. They give a multi-variable pic- 
ture of the molar and molecular aspects of be- 
havior and situation. They record in some meas- 
ure the continuities of behavior. They present 
unanalyzed specimens of behavior and psycho- 
logical situation. Other sciences have found it 
profitable to collect relatively intact, unanalyzed 
specimens of their phenomena for study, as in 
herbaria, and in anatomical and geological mu-
seums. Psychology has no such collections of
its raw material. If we want undissected records 
of how and under what conditions people actu- 
ally behave we have to go to novelists, diarists and 
news reporters.

There is some evidence of a new interest in the 
possibilities of such naturalistic records of be- 
havior which will meet present-day requirements 
of reliability and validity. Despite the discour- 
aging shortcomings of early anecdotal reports, 
there is reason to believe that such descriptive ac- 
counts can be greatly improved. Here is an ex- 
cerpt from a record that we have made. It is 
taken from a continuous narrative record of a 
child’s behavior from the time he awoke in the 
morning until he went to sleep at night. It ex- 
tends from 7:20 a.m. till 9:15 p.m. 


Roy is a seven-year-old boy. He lives in 
Midwest with his mother and father and two 
sisters (15 and 13 years) and brother (12 
years). The following sequence of behavior 
occurred in May, 1949. It took place after 
Roy had finished breakfast and before he left 
for school. Roy was dressing in his bedroom, 
which also serves as a hallway between the
kitchen and the living room. He stood in his
shorts beside his bed, where lay his clean jeans. 

8:33 Roy took a pair of neatly pressed jeans 
off the bed and viewed them soberly. 

He sat on the bed as he put his legs into the 
legs of the jeans. He was puffing with the 
effort of dressing, and probably because he had 
a cold, too. 

One of the snaps on the fly wouldn’t snap, 
though he tried hard, frowning impatiently. 

He said, “Ounnnn,” in protest and looked at 
me seriously for me to share this reaction to 
difficulty. 

I said, “It’s hard.” He said, “Urrrr,” softly 
in agreement. With another stronger effort he 
succeeded in fastening the snap. 

He showed no sign of satisfaction, however, 
as his attention quickly and eagerly turned to 
a belt on the bed. The belt seemed to be the 
os ha thing, but the jeans had had to go on
first.

He straightened the belt, but was not ready 
to put it on. 

8:34 He announced, “I’m going to take my 
belt,” as though this were something very im- 
portant. 

Then he put on the belt. This completed the 
dressing. 

Roy said proudly and with definiteness, “I’m 
going to take my gun, too.” He looked at me 
and said, “The gun is still under my pillow.” 
He then looked toward his pillow. 


His tones and glance suggested that there 
was prestige about sleeping with a gun 
under the pillow. However, there was no 
arrogance, just happiness. 


He reached under the pillow. 

To his evident surprise, he brought out a 
flashlight. 

He said, “Ahhh, here’s the flashlight, I didn’t
put it up yet.” He laughed as though he had 
been quite inefficient, that he should have done 
that some time before. 

He took the flashlight, and laid it carefully 
on a ledge of the nearby cupboard. 

8:35 Lola, his fifteen-year-old sister, called 
pleasantly, but with responsibility, “Are you 
about ready to go to school, honey?” 

As Lola came in, Roy got his gun and holster 
from under the pillow. Lola went directly to 
Roy and turned his collar down. 

She said, asserting her responsibility, but in a 
friendly voice, “Collars look better when they 
are turned down.” 

She kneeled and turned up Roy’s jean legs so 
they wouldn’t be too long. She was quite 
motherly about this. Roy was absorbed with 
his gun and holster, paying little attention to 
his sister. 

Then Lola noticed Roy’s holster and gun and 
said, “You shouldn’t wear that holster and the 
gun,” in an authoritative tone. 


He ignored her remark and quietly persisted 
in preparing to wear the gun. He was getting 
the holster placed just right on the belt. Then 
he took off a chain that was on the holster. 

He put the chain on the bed.

His sister said, “You can’t wear it.” Then 
as a more forceful approach, “Mother doesn’t 
want you to wear it.” She said this firmly and
with agitation. 

Roy showed no intention of heeding her. He 
adjusted the holster even more; then put the belt 
around him and began to fasten it. 

8:36 His sister exclaimed quite loudly as 
she saw him actually put the belt on, “Roy, you 
take it off!” 

Roy stood resolute and very quiet. 

This did not appear to be an ominous quiet- 
ness perhaps to be followed by tears or tantrum. 
I think Roy had decided he didn’t necessarily 
win by argument or pleading a cause, but that 
it is more prudent to act quietly, avoiding inter- 
ference. It was a technique, it seemed, for 
coping with a sister who would try to prevent 
him from doing what he wanted to do. Lola 
did not show hostility during the argument. 
She meant to be firm because she was responsi- 
ble for his dressing, but she did not show much 
agitation until she saw him actually putting the 
belt on. 

She seemed finally to be almost ready to con- 
cede the point, and said, “Do you play guns at 
recess?” 

He said, “Yes,” emphatically and also de- 
fensively, looking straight ahead. 

Vernon (the twelve-year-old brother), who 
came in to hear the argument, said, knowingly 
and in firm contradiction, “Oh, no you don't.” 
Lola said uncertainly and still trying to reckon 
with Roy, “Well, will it be warm enough to- 
day?” 

I took it the argument was over and Roy had 
won. 

The sister said, “I'll wash you.” She started 
for the bathroom, expecting Roy to follow. 


Records such as this, presenting as they do 
relatively complete accounts of behavior and psy- 
chological habitat as they naturally occur, appear 
to be of great potential value for studies of psy- 
chosocial development. Their validity, reliability 
and practicability remain to be determined, al- 
though there is little question that they are ade- 
quate for many problems. 

Through our work in making specimen records, 
we have learned a number of things which seem 
to be of methodological significance. 

1. At the present time specimen records must 
be made by watching and verbally describing the 
behavior of the subject. Sound moving pictures 
or television may become technically feasible in
the future, although the difficulties appear to be
very great. 

2. Records covering relatively long sequences of 
behavior are advantageous. In the first place, 
the danger of biased selection of behavior inci- 
dents is reduced as the length of the sequence is 
increased. We find that a record covering the 
behavior of a child through an entire day con- 
tains 750 to 1,000 behavior-habitat units or epi- 
sodes. A long rather than a short sequence of 
behavior is, therefore, roughly equivalent to a large 
rather than a small population sample. In the 
second place, the danger of false interpretation 
of the meaning of an episode of behavior is re- 
duced when it occurs in a wide context. In the 
third place, behavior is extended in time, and its 
sequential arrangements are as significant as its 
other features; continuity as an aspect of behavior 
is lost if single episodes, or short sequences are 
recorded. 

3. Specimen records should include descrip- 
tions of both the directly perceived, manifest be- 
havior—the vocalizations, the limb movements, 
the locomotions—and the observer's immediate in- 
ferences of the motivations and feelings of the 
subject. One who is concerned about the objec- 
tivity of such records must face the fact that he 
has to deal with the direction, the goals and the 
meaning of behavior when he studies psychoso- 
cial development, and that these can never be per- 
ceived directly. By recording long sequences of 
both directly observed and inferred behavior the 
best basis for constructing a final “true” record is 
laid. Observation is always selective, but it is
more or less so, and it can be guided by broad or 
narrow purposes. One aim in training observers 
to make specimen records is to reduce the bias 
and increase the breadth of the report. 

4. A specimen record should describe the situ- 
ation in which the behavior occurs as well as the 
behavior itself. Behavior divorced from its psy-
chological context is largely meaningless, and
much of the value of specimen records lies in the 
possibility of studying the relations between situ- 
ation and behavior. 

5. In making specimen records the influence of 
the observer must be kept minimal and constant. 
The observer is almost always a part of the sub-
ject’s psychological situation and hence influences
his behavior in some degree. This is an inevitable 
limitation of most naturalistic observations. An 
important part of the training of observers lies in
techniques of minimizing and holding constant
the involvement of the observer in the subject’s 
behavior. 

6. The optimal length of an observation period 
is small. The alertness required to perceive and 
remember the multitude of simultaneous and se- 
quential occurrences is fatiguing. Notes taken on 
the spot help, and they are a part of our proce- 
dure. We have found, however, that even when 
extensive notes are taken, an observer's efficiency 
rapidly declines after 30 minutes of observation. 
This means that long, continuous, specimen rec- 
ords require a team of observers. For a full-day 
record at least six observers are needed, with eight 
or nine desirable. 

7. Observations should be recorded by dicta- 
tion immediately after they have been made. It 
is probable that when the intention is to make 
ratings and general summaries of behavior and 
situation, a period of time to provide for perspec- 
tive and insight is advantageous. But when em- 
phasis is placed upon the concrete details, as in 
specimen records, immediate recording is essen- 
tial. 

8. We have found it valuable to provide an 
interrogator who listens to the original dictation 
of observations, and questions the observer on un- 
clear points after he has finished. This allows for 
both spontaneity in the original report and for its 
subsequent correction and completion. It usually 
requires at least an hour to record a half-hour 
observation; frequently a longer time. With a 
team of eight or nine observers a schedule can be 
arranged whereby each observer serves also as in- 
terrogator. Interrogation benefits both the ob- 
server’s record and the interrogator’s subsequent 
observations. 

9. The typescript of the original report and in- 
terrogations can conveniently be stapled in se- 
quence to wide continuous strips of paper for 
editing by the original observer. The editing in- 
cludes correcting, simplifying and welding the re- 
port and the interrogation into a final sequential 
description. In our work, the observer is always 
interrogated further in this stage of the procedure. 
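
The staffing implied by points 6 and 8 can be worked out in a short sketch. It assumes a simple rotation in which observers take consecutive 30-minute periods in turn; the rotation rule, the function names, and the observer labels are hypothetical and are not taken from the procedure described above. The day's limits (7:20 a.m. to 9:15 p.m.) are those of the record excerpted earlier; the calendar day used is a placeholder.

```python
from datetime import datetime, timedelta

# Observer efficiency is said to decline after 30 minutes (point 6).
PERIOD = timedelta(minutes=30)
# Recording a half-hour observation takes at least an hour (point 8).
MIN_DICTATION = timedelta(minutes=60)


def build_rota(start, end, observers):
    """Assign consecutive observation periods to observers in rotation."""
    rota = []
    t = start
    turn = 0
    while t < end:
        stop = min(t + PERIOD, end)  # the last period may be shorter
        rota.append((t, stop, observers[turn % len(observers)]))
        t = stop
        turn += 1
    return rota


def time_between_turns(n_observers):
    """Time from the end of one of an observer's turns to the start of the next."""
    return PERIOD * (n_observers - 1)


# Placeholder date; only the clock times come from the text.
day_start = datetime(1949, 5, 1, 7, 20)
day_end = datetime(1949, 5, 1, 21, 15)

team = [f"observer {n}" for n in range(1, 7)]  # the stated minimum of six
rota = build_rota(day_start, day_end, team)
print(len(rota), "observation periods")
print(time_between_turns(len(team)) - MIN_DICTATION, "left after dictation")
```

Under this rotation a team of six leaves each observer some time between turns beyond the hour needed for dictation, which can go to interrogation and editing; a team of eight or nine lengthens that margin accordingly.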

This process yields a record of the sort exem- 
plified by the excerpt given above. It provides a 
specimen of raw behavior data which can be used
for a number of purposes and which appears to be 
of particular value for studies of psychosocial 
development. The specific uses depend upon the 
methods which can be devised for analyzing and
conceptualizing the descriptive content of the rec-
ords. This problem, together with the problems
of reliability and validity, will be considered in 
forthcoming publications. 

Psychological ecology can make important con- 
tributions to an understanding of psychosocial de-
velopment. These contributions await the further
development of methods of studying behavior in 
naturally occurring situations.


EFFECTS OF FRUSTRATION
AND ANXIETY ON 
FANTASY AGGRESSION * 
ROBERT R. SEARS + 


So far as society at large is concerned, there 
are probably few matters of as little immediate 
importance as the doll play of three-year-old chil- 


* Reprinted by permission from The American 
Journal of Orthopsychiatry, July, 1951, Vol. XXI, 
No. 3, 498-505. 

1 The data here reported are from a research proj- 
ect on the personality development of young children 
conducted at the Iowa Child Welfare Research Sta-
tion with the aid of a grant from the Rockefeller
Foundation. The present study is a product of the 
project staff as a whole; the writer is simply a re- 
porter. Other members of the research staff who 
contributed directly were Lois Jean Carl, Helen 
Faigin, Eleanor Hollenberg, John McKee, Vincent 
Nowlis, Pauline S. Sears, Margaret Sperry and John 
W. M. Whiting. 


dren. Playing with dolls is a childish amusement, 
a delight for the carefree heart. It belongs to the 
young, who have no responsibilities—the young, 
for whom pain seems so quickly passing. Adults 
who must cope with the gut-constricting experi- 
ences of a real interpersonal world view child- 
hood with patience and, too often, a mild con- 
tempt. Children are second-class citizens, and the 
things they so characteristically do are second- 
class things. Playing with dolls is one of them. 

Such behavior is by no means second-class from 
the standpoint of psychological theory, however. 
Since those earliest sensitive observations of doll 
play by Anna Freud, psychoanalysts and experi- 
mental child psychologists alike have recognized 
that such play provides a kind of microcosmic 
laboratory for the study of motivation in its early 
stages of development. 

Doll play is a form of fantasy living, behavior 
influenced in part by external things but without 
so many of the rules and restrictions that control 
real interpersonal activity. Fantasy is not cate- 
gorically separable from other forms of behavior; 
it is a product of the same child, with the same 
habits and motives and the same potentialities 
for response, who lives and acts in the nonfantasy 
world. But fantasy occurs when some of the laws 
of the physical and social universes have been 
rescinded, and hence it is the product of the 
child’s drives and habits when these have differ- 
ent constants in the behavior equation. 

From a clinical standpoint, fantasy behavior 
must naturally be held in high regard. Any 
method of examining a child that provides a 
standard way of changing the weightings of par- 
ticular drives, or other instigators, is a means of 
discovering what those drives are and how heav- 
ily they are weighted in other forms of behavior. 
If it can be shown, for example, that doll play 
reduces aggression-anxiety at a determinable rate, 
the clinician can discover how much aggression 
motivation a child possesses, and how severely it 
is being inhibited, in his daily interactions with 
parents or teachers or other children. 

From the standpoint of psychological theory, 
fantasy has equal significance. Unless we are to 
suppose that the fantasy aspects of a child’s be- 
havior simply do not operate in accord with laws 
established for other aspects, the fantasies exhib- 
ited in doll play provide an easily manipulable 
experimental situation for the discovery and re-
finement of those laws. It is difficult, for exam-
ple, to study the effects of reproving aggression
in the home itself; in doll play the problem can 
be studied simply and briefly. 

Lest this example seem so simple as to make 
all the possible contributions of experimentation 
appear trivial, a word must be said about the pur- 
pose of experimental research. The aim of genetic 
psychological science is the same as that of any 
other science—to develop predictive accuracy in 
antecedent-consequent relationships. Science gen- 
erally achieves this aim by defining variables 
precisely, discovering relationships among them, 
and combining these principles into as economi- 
cally formulated a theory as possible. In child 
psychology we are a long way from the comple- 
tion of these tasks, and any one decade’s work 
adds but minute fractions to the ultimate theoreti- 
cal framework psychological science is building. 
Attempts to by-pass this aim of science, however, 
to flee from the difficulties of its accomplishment 
by denying its validity, only give the appearance 
of desperate excuses for the slowness with which 
the job is getting done. Let us be patient, and not 
too demanding that the universe be made dynamic 
and wholly understandable in the next fortnight. 
It is more important to find regularities in be- 
havior, to discover antecedent-consequent rela- 
tionships that make some coherent theoretical 
sense, than to satisfy our natural human passion 
to know all, or to understand the uttermost de- 
tails of any one personality, before we die. And 
from the standpoint of psychological science, it 
is infinitely more important to develop principles, 
even very simple ones, that will predict future 
behavior, than to discover descriptive patterns 
that give an illusion of understanding a child’s 
past. By these criteria, an examination of the 
effect of reproving or punishing children’s doll 
play aggression becomes an important matter, for, 
as will be seen, this experiment does provide some 
principles that aid in the prediction of behavior. 

Two major questions can be asked about the 
antecedents of doll play aggression: first, what 
factors determine its frequency, and second, what 
factors determine which doll agents are used for 
portraying it? Data obtained in two studies,² by
Margaret Sperry and Eleanor Hollenberg of our 
Laboratory staff, have provided some suggestive 
clues that indicate the theoretical framework 
within which answers may be sought. 


2 A more detailed report of these investigations is
published elsewhere (1).


The data with which we are concerned are the
aggressive actions of preschool-aged children in a 
doll play situation. A child was presented with 
a roofless doll house of six rooms containing con- 
ventional furniture; both the walls and furniture 
were movable. Five dolls—father, mother, boy,
girl and baby—were handed to the child, one at a 
time, and were individually identified for him if 
he did not so identify them himself. The dolls 
were described as the family that lived in the 
house, and the child was asked to tell a story 
about what the family did. He was carefully as- 
sured that he could have them do anything he 
wanted, that there were no restrictions on his 
play. In general, the methodological details of 
the procedure were those previously found to be 
most conducive to the free expression of aggres- 
sion (3). The child was allowed to play for fif- 
teen minutes and then was returned to his pre- 
school group. This procedure was carried out on 
four occasions, two to four days apart. 

Recording of the play was in terms of behavior 
units. An observer behind a one-way screen re- 
corded the agent and object of each action se- 
quence, and the behavior category to which it 
belonged. There were a number of categories, 
including various kinds of aggression, but for the 
present analysis these have been combined in such 
a way as to provide two descriptive scores: routine 
or stereotyped behavior that reflected conventional 
household activities of whatever doll was being 
used (“the mommy is getting breakfast”), and 
aggression (“the daddy spanks the baby”; “the 
girl hides when her daddy calls her”). The ex- 
perimental data, therefore, are in the form of fre- 
quencies of occurrence of these two categories per 
fifteen or per sixty minutes, and the frequency 
with which each doll was used as the agent for 
each kind of act. 

First consider the antecedents of sheer fre- 
quency of aggressive acts. Considerable evidence 
has been amassed in recent years to show that 
aggression is one of the common reactions to 
frustration, though by no means the only one, 
and that the strength or frequency of ag-
gression varies positively with the amount of
frustration.
operative at the moment, and increases their
strength. If an aggressive act occurs, its strength 
would then depend in part on how much drive 
value was added by the frustration. 

In order to determine whether the frustrations 
of a child’s home life had this kind of facilitating 
effect on fantasy aggression, it was necessary to 
obtain ratings of the severity of frustration in the 
home. Dr. Vincent Nowlis interviewed the moth- 
ers of thirty children to obtain information con- 
cerning methods of child rearing. A scale was 
constructed for rating the severity of the frustra- 
tions imposed both in infancy and currently. The 
thirty children could then be divided into two 
equal-sized subgroups, those whose mothers were 
rated in the upper half of the group on severity 
of frustration, and those whose mothers were 
rated in the lower half. The hypothesis specifies 
that the more severely frustrated children would 
be the more aggressive. This proved to be true; 
the higher frustration group showed a higher fre- 
quency of doll play aggressive acts. The varia- 
bility was large, however, and the difference was 
not very reliable. Evidently some other factor 
besides frustration was influential in controlling 
frequency. 

The Whiting hypothesis suggests where to search 
for this next variable. Let us assume that every 
child has some initial motivation to perform ag- 
gressively. Punishment of this aggression would 
create an anxiety that would tend to inhibit the 
aggressive behavior in the stimulus setting where 
the punishment originally occurred. That is, if a 
child’s overt aggression were severely punished by 
his mother, that particular kind of aggression 
would tend to be eliminated, or at least reduced, 
when the mother was near. Being aggressive and 
fearing to express aggression creates a conflict, 
however, and this conflict itself would serve as a 
frustration. Such a frustration should increase 
the total drive available for instigating aggression.
In other words, the more anxiety a child feels 
about expressing his aggression, the more instiga- 
tion to aggression he will have. 

The question is, Where will he express this 
aggression? The punishment that created his anx- 
iety occurred at home, and therefore he will prob- 
ably reduce his actual overt aggression there. 
Still, he has an increased drive to express it. Here 
a new behavior process must be taken into account 
—the process of stimulus generalization. When 
the child is placed in the very permissive doll play
situation, which is only partly like his own home,
he is being placed in a stimulus situation that is 
far and away from the original reinforcing situa- 
tion on a similarity continuum. By the principle 
of stimulus generalization, the doll setting will 
provide stimulation toward both aggression and 
aggression-anxiety. 

But an important qualification enters here. 
Neal Miller (2) has shown that the anxiety gradi- 
ent is steeper than the aggression gradient. That 
is, anxiety diminishes more rapidly than aggres- 
sion when the mother is absent. The doll house 
is only somewhat like the home, and with these 
differential gradients, the child would be expected 
to show more aggression than anxiety. 
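
The differential-gradient argument can be sketched numerically. What follows is a minimal illustration only, assuming straight-line gradients with arbitrary starting strengths and slopes. None of the numbers come from Miller (2) or from the present data, and the function name `net_tendency` is hypothetical.

```python
def net_tendency(distance, aggression_at_home=10.0, anxiety_at_home=10.0,
                 aggression_slope=0.5, anxiety_slope=2.0):
    """Return (aggression, anxiety, net) at a given dissimilarity from home.

    Assumption: both tendencies fall off linearly as the situation becomes
    less like the home, and the anxiety gradient is the steeper of the two.
    All default values are illustrative, not measured.
    """
    # Each tendency weakens with distance but cannot drop below zero.
    aggression = max(0.0, aggression_at_home - aggression_slope * distance)
    anxiety = max(0.0, anxiety_at_home - anxiety_slope * distance)
    # The net tendency is what remains of aggression after anxiety opposes it.
    return aggression, anxiety, aggression - anxiety


if __name__ == "__main__":
    for d in (0, 2, 4):
        print(d, net_tendency(d))
```

Under these assumed values the two tendencies balance at home (distance 0), and the excess of aggression over anxiety grows as the situation becomes less like the home, which is the pattern the argument requires of the doll play setting.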

If this principle is combined with Whiting’s 
hypothesis that the conflict itself increases the in- 
stigation to aggression, we are faced with the pre- 
diction that the more severely a child is punished 
for aggression by his mother, the more aggression 
will he show in doll play. 

In order to test this reasoning, the interview 
materials were again examined and a rating scale 
was constructed for measuring the mother’s puni- 
tiveness of aggression. The thirty children were 
then divided into two equal sub-groups, those 
whose mothers were in the upper half, and those 
whose mothers were in the lower half of the puni- 
tiveness scale. The average frequencies of ag- 
gression for the two groups were compared, and 
as predicted, the more severely punished children 
had a higher frequency of aggressive acts. The 
difference is significant at a 5 per cent level of 
confidence. 
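
The comparison just described, dividing the children at the midpoint of the mothers' ratings and comparing average aggression frequencies, can be sketched as follows. This is an illustration under assumptions: the helper names are hypothetical, the split is made by rank so that the halves are equal in size, and the toy numbers in the usage lines are not data from the study. No significance test is included.

```python
from statistics import mean


def split_halves(ratings, scores):
    """Return (upper, lower) lists of scores, split by rank on ratings.

    ratings: one rating per child (e.g., mother's punitiveness).
    scores:  one aggression frequency per child, in the same order.
    """
    # Order the children from lowest to highest rating.
    order = sorted(range(len(ratings)), key=lambda i: ratings[i])
    half = len(order) // 2
    lower = [scores[i] for i in order[:half]]
    upper = [scores[i] for i in order[half:]]
    return upper, lower


def mean_difference(ratings, scores):
    """Mean score of the upper half minus mean score of the lower half."""
    upper, lower = split_halves(ratings, scores)
    return mean(upper) - mean(lower)


if __name__ == "__main__":
    # Toy values for four hypothetical children, purely illustrative.
    toy_ratings = [3, 1, 4, 2]
    toy_scores = [30, 10, 40, 20]
    print(split_halves(toy_ratings, toy_scores))
    print(mean_difference(toy_ratings, toy_scores))
```

A positive difference corresponds to the direction predicted in the text, with more aggression in the more severely punished half. Whether such a difference is reliable depends on the within-group variability, which this sketch does not assess.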

It is worth examining the relationship between 
the two scales of frustration and punitiveness. If 
they were highly correlated, the two sets of com- 
parisons just cited would be mainly measures of 
the same factor. In fact, however, the scales 
correlate only .17. The two variables are almost 
completely independent, and it is therefore clear 
that the low reliability of the comparison of either 
variable alone is in part a product of uncorrelated 
variability introduced by the other. 

It is unfortunate that no measures of the chil- 
dren’s aggression at home were available, for it 
will be recalled that the derivation of these find- 
ings rested on the assumption that maternal 
punishment would inhibit overt aggression. The
direct effects of punishment have been exempli-
fied in another experiment, however.


3 The lack of these aggression-at-home data is even
more unfortunate from another standpoint. The pres-
ent reasoning assumes that both groups, the more-
and the less-punished, were equal in nonparentally
determined sources of instigation to aggression, and
that the difference in severity of punishment was the
crucial variable. It is possible, of course, that the
more-punished children were more aggressive to start
with (for constitutional or other reasons), and there-
fore evoked more severe punishment from their more
severely irritated parents. If this were the case, the
amount of doll play aggression would be simply an-
other measure, in addition to home behavior, of the
strength of instigation to aggression, and the drive
produced by conflict would be but a secondary factor
conducing to the same end results. In other words,
the present data would be less crucial for the conflict-
drive hypothesis. Such an explanation would not
account, however, for the differences between the two
groups with reference to which doll agents were most
frequently used, as described later in this paper.


Another group of twenty-three children were
divided into two subgroups, equated for age and
sex. One, the control group, was given four
permissive sessions as in the previous experiment.
The other, the experimental group, was treated
similarly on all but the second session, when each
expression of aggression was reproved (“Now,
John, you know nice boys don’t do things like
that”).

The two groups expressed an equal amount of
aggression on the first session, but such behavior
began to be inhibited in the experimental group
during the second. On the third session, the ex-
perimental group performed only a quarter as
many aggressive acts as the control group, even
though the experimenter’s attitude was once again
permissive. On the fourth session this permis-
siveness began to tell; the experimental group’s
aggression increased to the level of the control
group on session two. The difference in fre-
quency was significant at well beyond the 1 per
cent level of confidence on the third session and
just less than 1 per cent on the fourth.

So much for frequency of aggression. High
amounts of fantasy aggression are related to home
frustration and to maternal punishment, while
direct punishment of fantasy itself reduces fre-
quency.

Now let us consider which doll the child uses
to portray his aggression. Here the problem of
identification enters. A child’s identification with
his parents is based in part on the nurturing
behavior that the parents show toward him.
That is, how much a child identifies with his
parents depends on how much affectionate and
nonpunitive nurturance he gets. The children of
highly punishing mothers would show least iden-
tification with them.

In doll play there is at least one good measure 
of identification, namely, the extent to which the 
child uses the parent dolls for the portrayal of 
routine household activities. In using the dolls 
for this purpose, the child appears to be adopting 
the parental role. He is playing as if he were 
himself the parents, and had their prestige, power 
and care-taking position in the family. 

According to this reasoning, the children of 
more highly punitive mothers should use the 
parent dolls, as agents of routine nonaggressive 
acts, less often than should the children of less 
punitive mothers. The two subgroups men- 
tioned earlier give an opportunity for testing this 
hypothesis. A comparison of the upper and 
lower halves of the distribution of mothers’ puni- 
tiveness shows that the children of the less puni- 
tive mothers do use the parent dolls more for 
routine activities than do the children of the 
more punitive mothers. The difference is at the 
5 per cent level of confidence. 

Now, if it may be hypothesized that there is a
dimension of degree of identification comparable
to a dimension of stimulus similarity, the princi- 
ples of generalization should operate with respect 
to identification also. In the doll play perform- 
ance, for example, children should use the parent 
dolls as agents of routine activities more often 
than the child dolls. This proved to be the case. 
The order of mean frequency of use of a doll as 
agent was in the order: parent doll of same sex, 
parent doll of the other sex, child doll of same 
sex, and child doll of other sex.4 At the first
session of these experiments, the parent dolls 
were used, on the average, in about 57 per cent 
of the behavior units categorized as routine and 
nonaggressive. 


4 The baby doll has not been included in this analy-
sis because certain additional factors such as the
children’s indecision about its sex, and possibly the
presence or absence of a baby at home, led to great
variability in its usage.


The next step is to examine the effects of
punishment on the expression of aggression. If
we assume a dimension of identification along
which generalization can occur, then, just as
with routine behavior, the most aggression should
be portrayed through the parent dolls, and the
least through the child dolls. But since aggres-
sion nearly always gets punished at home, there
should also be a generalization of aggression-
anxiety along this dimension. Since this latter
gradient is steeper, however, the two factors
should not simply cancel each other out, making
aggression from all dolls equally likely. On the
contrary, the difference in steepness should pro-
duce relatively more child doll aggression. In
other words, the greater the anxiety, the more
aggression will be portrayed through the child
dolls rather than the parent dolls.

This hypothesis can be tested in two ways, and 
both comparisons support it. First, one can com- 
pare the children of highly punishing mothers 
with those of low punishing mothers. The former 
subgroup showed almost exactly equal use of 
parent and child dolls for portraying aggression, 
while the children of less punishing mothers used 
the parent dolls nearly four times as much as 
the child dolls. The difference is significant at 
the 1 per cent level of confidence. In other 
words, the more anxious children used the parent 
dolls to portray aggression much less often than 
did the less anxious children. 

In the other experiment, involving punishment 
of aggression during the second session, the con- 
trol group increased from 37 per cent to 73 per 
cent usage of the parent dolls for aggression from 
the first to the fourth session, while the experi- 
mental group, which had had its aggression- 
anxiety increased by reproof, increased from 43 
per cent to only 54 per cent. The increase for 
the control group is significant at the 5 per cent 
level of confidence but that for the experimental 
group is entirely nonsignificant. 

One final effect of anxiety remains to be con- 
sidered. In a number of previous investigations 
it has been shown that the total frequency of 
aggression in doll play increases from session to 
session. It has been supposed that this increase 
was a result of decreasing aggression-anxiety in 
the permissive atmosphere of these experiments. 
If this hypothesis is correct, we should expect 
that the ratio of parent doll to child doll aggres- 
sions would increase from the first to the fourth 
session. With both the control group of the 
punishment experiment and the entire group of 
the other study, this is the case, although the 
significance of both differences is a little less than 
the 5 per cent level of confidence. Thus, the 
dissipation of anxiety not only permits aggression
to increase from session to session, but it becomes 
more and more expressed through the parent 
dolls. 

These findings, it will have been noted, all re- 
late to the average performance of groups of 
children. They do not apply to individual chil- 
dren. There may be several other variables be- 
sides frustration and punishment that influence 
the frequency of fantasy aggression and the 
choice of doll agents for its portrayal. Certainly 
the large within-groups variability in the present 
studies suggests that this must be the case. If so, 
much more experimentation will have to be done 
in order to isolate these additional variables. 
When that task has been accomplished, the theo- 
retical psychologist will move on to new problems. 

The clinical psychologist, however, will at that 
point be able to make a simple reversal of the 
various antecedent-consequent relationships dis- 
covered, and apply them as principles of diag- 
nosis. That is, he will be able to measure the 
frequency of aggression, and note the relative 
frequency of the different doll agents used in its 
expression, and from those data decide how 
seriously inhibited a child is with regard to ag-
gression, how much anxiety he feels about its
expression, how closely identified he is with his 
parents, and so on. 

In summary, then, it may be suggested that 
the present stage of this research reveals two 
rather interesting clues the diagnostician can use. 
The data suggest that the child who shows a 
great deal of thematic aggression may be not only 
a highly frustrated child, but, contrary to com- 
mon sense, also have been punished a great deal 
for his real life aggressions. In other words, high 
doll play aggression can be a product of parental 
attempts to reduce aggression. Secondly, the 
rapid increase, through several sessions, of the 
use of parent dolls in aggressive sequences sug- 
gests a relatively low anxiety about parental 
punishment. That is, the child who portrays the 
parent dolls as most aggressive is likely to be the 
child whose own parents are actually the least 
punitive or counteraggressive toward him. 


IS THE HUMAN PERSONALITY
MORE PLASTIC IN INFANCY 
AND CHILDHOOD? * 


IAN STEVENSON * 


“The doctrines which best repay critical exami-
nation are those which for the longest period
have remained unquestioned.”
—A. N. WHITEHEAD 


No assumption of modern psychiatry enjoys 
greater acceptance than the belief that human 
personality is more plastic in infancy and child- 
hood than in later years. Although widespread 
today, the belief belongs to modern times. The 
writings and practices of ancient Greece and 
Rome showed great concern for the education 
and training of children; and no less for those of 
adults. We cannot say exactly when the modern
emphasis on childhood training and relative 
neglect of adult training began. In the 16th cen- 
tury St. Ignatius made a clear statement of it. 
He declared that if he could have the teaching 
of a child until the age of 6, he did not care who 
instructed him afterwards. He firmly believed 
that nothing could undo the teachings of the early 
years. From about this time on, a belief in the 
paramount importance of childhood experience 
in the formation of personality forms a central 
doctrine of many systems of psychology (1, 2). 


Reprinted with permission from The American
Journal of Psychiatry, August, 1957, Vol. 114, No.
2, 152-161.

1 Read at the 112th annual meeting of The Ameri-
can Psychiatric Association, Chicago, Ill., Ap
May 4, 1956.




A balanced view of the contributions of hered- 
ity and environment to human personality slowly 
emerges in our literature (3, 4). Extreme posi- 
tions in this old controversy no longer appeal, 
and none will be adopted in this paper, for the 
question at issue is not whether environment or 
heredity contributes more to the formation of 
human personality, but whether the contribution 
of environment occurs unevenly. The environ- 
ment plays upon the organism from the moment 
sperm and ovum unite until the end of life. 
A priori, we have no grounds for believing that 
the environment exerts greater force at one period 
than at any other. Some proponents of this be- 
lief have said that the helplessness and necessary 
dependency of the infant upon his parents ac- 
count for his susceptibility to their influence, but 
this is to offer an explanation of an assumption 
rather than a foundation for it. Helplessness and 
dependency do not necessarily render the person- 
ality more malleable if we may judge by the 
behavior of the sick and the aged. 

Diverse observations made over the past 10 
or 15 years throw doubt on the assumption of 
an uneven distribution of environmental effects. 
Taken together they provide a rather formidable 
obstacle to its acceptance. They do not disprove 
the assumption, but they threaten seriously the 
claim that it is already proven. I shall consider 
the data which have brought me to this conclu- 
sion under 4 arbitrary headings, any one of 
which may include material relevant to another 
aspect of the problem. 


REVIEW OF RELEVANT DATA 


CHILD TRAINING PRACTICES AND THE LATER FORM 
OF THE PERSONALITY 


The literature of psychiatry abounds in articles 
asserting causal connections between the early 
experiences of life (especially training practices) 
and the later personality. The far fewer articles 
reporting objective studies of such relationships 
fail to support the assertions made (5, 6, 7, 8, 9). 
Thurston and Mussen (5) contribute a review of 
earlier studies in addition to their own negative 
empirical study. Orlansky (8) and Lindesmith 
and Strauss (9) have reviewed the empirical stud- 
ies and concluded that the published data fail to 
demonstrate a consistent relationship between 
child training practices and adult personalities. 
Studies of the relationships between child
rearing practices and adult personality frequently
fail to define clearly the traits under scrutiny.
A notable exception is the research of F. Gold-
man-Eisler (74) into the connection between
breast feeding and the later exhibition of traits
of “orality.” Goldman-Eisler subjected the data
derived from careful studies of her adult subjects
to a factorial analysis from which emerged a
definite correlation between the kind of breast
feeding experienced (i.e., early weaning or late
weaning) and the later occurrence of certain
personality traits (characterized respectively as
orally ungratified and orally gratified types).
The fact of a correlation is thus clear, but it is
open to two interpretations. The later personal-
ity may arise from the impact of the earlier ex-
periences, but Eysenck (75) has pointed out that
genetic factors may also account for the correla-
tion in the following way. The polar oral types
(gratified and ungratified) correspond rather
closely to the extravert-introvert dichotomy fa-
miliar in Jungian psychology. We may plausibly
suppose that introverted mothers tend to have 
introverted offspring through genetic factors 
alone, and that introverted mothers also tend to 
wean their infants earlier than do extraverted 
mothers. Thus the established correlation may 
arise from either genetic or experiential factors. 
The question of which explanation is the correct 
one must await further research. 

It may be objected that parents influence their 
children through their attitudes to the children 
and that the actual training practices they adopt 
are of quite secondary importance. This shifts 
the argument to another area, but one in which 
there are even fewer objectively derived data. 
Nothing is gained by inferring attitudes on the 
part of the parent from the specific child train- 
ing practices for, as mentioned above, no causal 
connections have been demonstrated between the 
training practices and the later personality. 
Moreover, no correlations have been demon- 
strated between child training practices and the 
attitudes of parents towards children. Some au- 
thors have claimed, for example, that early bowel 
training implies compulsive rigidity on the part 
of the mother or that late weaning reveals indul- 
gence and affection. Yet the same training prac- 
tices occur in widely different cultures in which 
the parents seem to take up quite different at-
titudes towards the children. One particular
training practice, e.g., restraint of infants, may 
occur in many different cultures, but in each of 
these cultures occur other child training prac- 
tices which differ (8). The occurrence of one 
or even several particular child training practices 
permits no valid inference either about other 
training practices in the same family or culture, 
or about the over-all attitude of the parents to 
the children (70). The statements of patients 
concerning the attitudes of their parents towards 
them as children obviously deserve no credence 
when we are studying the origins of the person- 
alities of the patients. In the first place, these 
personalities may have provoked the alleged 
parental attitudes, and secondly, patients can 
wildly distort the attitudes of their parents in 
reporting them. 

When it becomes possible to observe, rather 
than merely assume the attitude of parents to- 
wards children, an important connection between 
these attitudes and the behavior of children can 
sometimes be demonstrated. Johnson and Szurek 
studied a number of children and their parents in 
therapy. In this way they related the attitudes 
and suggestions of the parents to the children and 
the behavior of the children. Parental attitudes 
and impulses, often unconscious and communi- 
cated covertly, promoted a wide variety of psy- 
chophysiologic sociopathic symptoms in these 
children (11, 12, 13). Nothing could demon- 
strate better the influence of one human on an- 
other. But the questions of interest to us here 
are whether parents exert a permanent influence 
on children and whether they exert a greater in- 
fluence on children than upon each other. These 
questions the studies of Johnson and Szurek can- 
not, and were not intended to answer. 


CHILD TRAINING PRACTICES AND THE OCCURRENCE 
AND FORM OF LATER MENTAL ILLNESS 


If the experiences of childhood importantly in- 
fluence the later personality, we should expect to 
find some correlation between such experiences 
and the later occurrence of mental disorders. In 
fact, no such correlations have ever been shown. 

Moloney reported an exceedingly low inci- 
dence of mental illness among the Okinawans 
(21). He concluded that the child rearing prac- 
tices of the Okinawans (which include a great 
deal of oral gratification and affection) fortify 
them against the occurrence of mental illness; yet
Moloney’s figures of admissions to hospitals for 
mental illness provide no reliable estimate of the 
real incidence of mental disease in Okinawa. 
At the time of Moloney’s observations, the ad- 
missions to hospitals for psychiatric disorders 
must have reflected quite inadequately the inci- 
dence of mental illness among these people. 
Moloney himself comments on 3 factors which 
alone would throw doubt upon the usefulness of 
the figures he quotes. There were almost no 
psychiatric facilities on the islands; the natives 
kept the mentally ill at home; and they treated 
the mentally ill with derision and more serious 
forms of cruelty, including physical violence. 
(Parenthetically, the last feature of their be- 
havior might make one doubt the purported ab- 
sence of mental illness which in Moloney’s paper 
was apparently defined as psychosis.) These 
factors would all tend to reduce the admissions 
to hospitals, but not the occurrence of mental 
illness.

But apart from the questionable validity of 
Moloney’s figures, a study of Okinawans in 
Hawaii (where the Okinawans continue the same 
child rearing practices) showed that this racial 
group has a considerably higher incidence of 
mental illness than other racial groups in Hawaii 
(17). Thus even if we grant that Okinawans 
adjust well to the circumstances of their own 
islands, they apparently adjust less well than 
other groups to certain changed circumstances 
and have no special immunity to mental disorder. 

If good mothering does not confer protection 
against mental disorders, it may be that bad 
mothering or lack of mothering promotes mental 
disorder. On this subject there are more data, 
although none that can permit generalizations. 
Anna Freud studied a group of children who had 
been unusually deprived, through the exigencies 
of war, of all that is generally considered neces- 
sary in the way of good mothering care (23). 
Yet these children made remarkably good adjust- 
ments, perhaps since they were in a group obtain- 
ing from each other what they did not get from 
a mother. This subject will be taken up again 
later.

It is frequently alleged that parental attitudes 
contribute to the formation of a personality spe- 
cially susceptible to schizophrenic reactions. Yet 
different studies on the parents of schizophrenic 


141 


patients fail to show consistent portraits of the 
personalities of the parents or to confirm the 
popular stereotype that the mothers are exces- 
sively anxious, domineering and solicitous with 
regard to the child who later becomes schizo- 
phrenic (16, 17, 18, 19). Moreover, the parents 
of a child who later becomes schizophrenic more 
often than not have raised other children who de- 
veloped normally. No doubt studies of environ- 
mental influences on the formation of personali- 
ties predisposed to psychoses labor under severe 
handicaps. This should make us more rather 
than less cautious in attributing psychoses to such 
influences. We should be all the more cautious 
in view of the much clearer evidence of important 
genetic factors underlying psychoses (20). 

That the forms of mental illness vary widely 
in different parts of the world is abundantly clear 
from comparative studies (24, 25, 26, 27). Some 
of these differences arise from disagreements be- 
tween different cultures as to what constitutes a 
mental illness. Forms of behavior which are 
considered psychotic in our culture may find ac- 
ceptance and even approval in another. How- 
ever, apparently similar mental disorders, e.g., 
schizophrenia, do show somewhat different forms 
in different cultures (25, 27). Psychotic pa- 
tients in India (25) and in Japan (26) apparently 
exhibit much less violent behavior than is cus- 
tomarily seen in the mental hospitals in the West, 
but we have no data which might permit us to 
attribute such differences exclusively to training 
in infancy. For in India passivity, and in Japan 
obedience and obligation, form a central part of 
the training not of infants only, but of everyone 
of any age. 


THE EFFECTS OF ISOLATION IN CHILDREN 
AND ADULTS 


That the isolation of children from other hu- 
man beings can exert a markedly destructive effect 
on personality has been known for centuries 
(28). Recently a number of studies have tried 
to sharpen our understanding of the effects on 
infants and children of isolation and the con- 
comitant deprivation (29, 30, 31, 32). The work 
of Spitz has drawn widespread attention and 
been generally interpreted to confirm the impor- 
tance of adequate mothering during infancy and 
childhood. Spitz compared 2 groups of infants 
who were apparently raised in similar physical 
circumstances. One group received abundant
mothering, while mothering attentions were
sharply curtailed in the other group. The second
group, compared to the first, showed an increased
morbidity and mortality and a failure to develop. 
Pinneau has criticized Spitz’s work on the grounds 
that he failed to allow adequately for a number 
of factors which could have accounted for the 
differences between the 2 groups, e.g., genetic 
differences, and different exposures to diseases 
such as measles which carried off many of those 
who died in the deprived group (33). But even 
if we grant that Spitz’s data support his conclu- 
sion concerning the harmful effects of isolation 
on infants, two further questions remain. Are 
such effects any less in adults? What is the dura- 
tion of such effects on infants? I shall take up 
the first question next and return to the second 
later. 

In general human beings are rarely as cruel to 
adults as they often are to children. It is difficult 
to find an adult situation which resembles exactly 
the predicament of institutionalized infants. One 
approximately comparable situation may occur 
during the artificial reduction of sensory stimuli 
as in the experiments of Heron (34, 35) and 
Lilly (36). The subjects of these experiments 
were isolated almost completely from sensory 
stimuli. Institutionalized infants are usually iso- 
lated from close human (so-called affective) con- 
tact rather than from all sensory stimuli. How- 
ever, in the “Foundling Home” studied by Spitz 
(30) the babies received a greatly reduced sen- 
sory stimulation due to screening sheets around 
the cots and high walls between the cubicles. 
Such features make the situation of these infants 
resemble rather closely that of the adult experi- 
mental subjects under discussion, although there 
are also differences. 

In the experiments with adults, the subjects 
experienced tension and anxiety which were fol- 
lowed in those who could stand the experience 
long enough by marked disorders of perception
and thinking. Hallucinations and delusions oc-
curred in some of the subjects. Few subjects
could tolerate these experiences for more than a 
day or two at most. For this reason the possible 
effects of prolonged sensory isolation could not 
be estimated from such experiments. 

Another adult situation with some resemblance 
to that of institutionalized infants occurred in
concentration camps (37, 38, 39, 40) and in some
camps for prisoners of war (41, 42). In these
camps the prisoners were not isolated from other
people or from stimuli. However, the people 
with whom they were in contact were unable to 
provide them with anything like the usual amounts 
of psychological support because they were either 
hostile guards or fellow-prisoners who were in 
the same desperate plight themselves. It cannot 
be said that loss of affection was the only stress 
to which the prisoners were exposed. Most suf- 
fered from malnutrition and many were subjected 
to physical maltreatment. However, it seems 
reasonable to conclude that the main stresses 
were psychological from the following facts: first, 
marked responses occurred almost immediately 
and before the effects of starvation could have 
influenced behavior; secondly, the psychological 
responses were greater than those accompanying 
starvation alone (43, 44); and thirdly, the re- 
sponses to the situation were far from uniform 
among those who were receiving the same diet 
and mistreatment. Different responses could be 
correlated with different attitudes and personali- 
ties (38, 42). Parenthetically, children adapted 
to concentration camps much more readily than 
adults and the aged least of all (37). 

Turning to the observed responses of inmates 
of concentration camps and prisoner-of-war
camps, we find an extraordinarily high incidence 
of psychological disturbances. Severe apathy oc- 
curred almost universally; almost as common was 
the exhibition of fiercely self-interested and hos- 
tile behavior. For many prisoners the psycho- 
logical effects were even more devastating and 
extended to stuporous states, dissociated states 
and death. Some of the inmates deliberately com- 
mitted suicide by annoying the guards to the point 
where the guards shot them or by running against 
the electrified wires around the camp. Many 
prisoners died without sufficient apparent physi- 
cal cause and hence presumably from the psycho- 
logical effects of their situation. It is naturally 
impossible to estimate the incidence of psychoses, 
suicides, and other deaths from psychological rea- 
sons in these camps, nor is it possible to estimate 
the total duration of the effects, although for 
many persons the effects are known to have lasted 
for many years. 

As I said earlier, the experiments of sensory iso- 
lation and the stresses of concentration camps 
certainly do not exactly resemble the situation of
institutionalized infants. Nevertheless the situ-
ations have enough resemblance to permit a com-
parison in which adults appear no stronger than
infants. The point of making such a comparison 
is not to suggest that infants cannot be damaged 
by isolation, but to remind ourselves that adults 
are no less vulnerable. The response of infants 
to isolation is not an infantile one, but a human 
one. Studies of the effects of isolation on infants 
teach us the importance of affection to all hu- 
mans; they cannot prove its greater necessity for 
children than for adults. 


IMPERMANENCE OF PSYCHOLOGICAL 
SYMPTOMS OF CHILDHOOD 


Many studies on institutionalized infants and 
on children with psychological disorders have not 
included lengthy follow-ups to observe their later 
course in life. In the studies of institutionalized 
infants by Spitz (30, 31, 32) and Bender (45) 
the children were followed only to early child-
hood. Goldfarb (46, 47, 48) followed a similar
group of infants into early adolescence. In all
these observations, although the children showed
variations in development, in general their matu- 
ration and adjustments fell far behind those of 
children raised under normal circumstances. 

However, Beres and Obers (49) followed into 
late adolescence and early adulthood a group of 
infants who had been reared in institutions com- 
parable to those of the other studies cited. Of 
this group approximately half were judged to have 
made a satisfactory social adjustment. This seems 
like a remarkable degree of improvement, espe- 
cially in view of the fact that such children are 
poorly endowed genetically, usually having unmar-
ried or mentally ill mothers.

Caplan studied the children raised in the com- 
munal agricultural settlements of Israel (50). 
The rearing of these children is largely in the 
hands of professional workers who care for groups 
of children. The children live in nurseries and 
later in schoolhouses with other children of the 
same age. They spend some time with their bio- 
logical parents, but nearly all the training and 
discipline are in the hands of the professional 
workers. Any one child will experience two 
changes of workers between birth and 3 years 
of age. The children become strongly attached 
to the members of their own group who apparently 
signify as much for them as do the members of 
their biological families. They usually remain
with the same group until adolescence. The im-
portant observations of these children are that in
their early years they show marked signs of psy- 
chological disturbance, e.g., temper tantrums, 
thumb-sucking, and enuresis, but that in adult- 
hood they are remarkably healthy both physically 
and mentally. 

A somewhat similar transformation was ob- 
served in a group of 54 severely shy, anxious and 
withdrawn American children, who were disturbed 
enough to be examined in a child guidance clinic 
(51). At the time of the original evaluation, the 
children had a median age of about 7 years. They 
were then studied again 16 to 27 years after the 
initial evaluation. Two-thirds were found to be 
making a satisfactory adjustment and one-third 
a marginal adjustment. Those in the latter group 
were distinguished from the former by not ful- 
filling all their potential or deriving as much en- 
joyment from life as seemed possible. Nearly 
all of these children had married when they 
reached adulthood; many had married outgoing 
wives with whom they shared an active social 
life. Of the entire group only 2 were considered 
ill and only 1 of these was schizophrenic. 


DISCUSSION 


The data reviewed above throw doubt upon 
the belief that the events of infancy and childhood 
are necessarily more formative of personality than 
those of later years. None of the data reviewed 
conflicts with the established fact of human in- 
fluence on human beings, or with our knowledge 
that the impact of one stress with a resultant 
strain modifies the response to a succeeding stress. 
What comes first influences the response to what 
comes after. The events of infancy and child- 
hood will always have much importance because 
of their temporal precedence, but perhaps not be- 
cause of any special fragility of the personality in 
those years. 

This raises the question of the duration of the 
effects of a particular experience. As already 
mentioned, the assertion that the events of in- 
fancy and childhood always exert a special in- 
fluence in forming the adult personality is still 
a statement of opinion, not of fact. Neverthe- 
less, there are many reactions of adulthood which 
seem to repeat or imitate those of infancy and 
childhood. We may account for such resem-
blances in a number of ways without recourse to
the hypothesis of a special impressionability of
the personality in infancy and childhood. 


ORIGINS OF RESEMBLANCES BETWEEN BEHAVIOR 
IN CHILDHOOD AND ADULTHOOD 


The first possibility in accounting for such re- 
semblances is that a conditioned response fails to 
extinguish because of some innate characteristic 
within the subject. Experiments in conditioning 
show that extinction of a learned response ordi- 
narily occurs rather steadily in the absence of fur- 
ther reinforcement. The learned response is 
greatest immediately after the conditioning ex- 
periences and lessens progressively. However, 
we know that fears and other learned responses 
often fail to extinguish and, on the contrary, con-
tinue an active and irrational course for many
years. But we also know that the same events 
which stimulate such unextinguishing fears in 
some persons fail to do so in others. They may 
even have the opposite effect. An event which 
proves traumatic to one person may strengthen 
another. The difference presumably lies in the 
way the event is experienced; that is, in the re- 
sponse the person makes to the stimuli which 
events bring him. But we have no reason to be- 
lieve that infants are more liable to experience 
events in a fearful way than are adults. The 
capacity to acquire a fixed, irrational fear or other 
learned response is found in adulthood as much 
as in childhood, and perhaps more so. 

Years ago, Breuer and Freud (52) remarked 
that “the hysteric suffers mostly from reminis- 
cences.” This is true, but it does not follow and 
has never been shown that the events of which 
the reminiscences are partial and distorted mem-
ories differ significantly for such patients (or other
patients with psychological disorders) from those
experienced by other persons. As already sug-
gested, the events of childhood may be experi-
enced differently by the psychoneurotic patients.
Alternatively, the patients’ childhood experi-
ences may not be unusual, but the patients later at-
tribute a painful quality to them when viewing them
retrospectively from the current discomforts of
adulthood. Mixtures of these processes may
occur, because when a person has difficulty in
mastering a current conflict, he can readily find
comfort in attributing his difficulty to previous
supposedly damaging events.

Resemblances between infantile or childish and 
adult responses may occur also when a series of
reinforcements has followed the first harmful ex-
periences of childhood. One harmful stimulus
may succeed another so closely that the infant or 
child cannot recover his balance in time to react 
favorably to any event. This is perhaps an im- 
portant factor in the devastating effects of insti- 
tutions on infants and of prison camps on adults. 
The stress is unremitting and harmful effects sus- 
tained. We have then not a personality “fixed” 
by early harmful events, but one bombarded by 
a continuous succession of harmful events. How- 
ever, in ordinary life this situation must be rather 
exceptional. Few children meet unremitting 
cruelty or neglect. Suffering children rather 
readily evoke a tender response in those around 
them. Most children encounter opportunities to 
unlearn whatever negative responses they may 
previously have learned. If they fail to do so 
we can as plausibly attribute the difficulty to an 
innate defect of responsivity as to the severity of 
the previous stresses. 

The most harmful of all experiences seems to 
be a deprivation of stimuli. Apparently growth 
cannot occur in the absence of stimuli from the 
environment. Institutionalized and isolated in- 
fants lack this stimulus and so fail to develop the 
qualities necessary for growth-promoting contacts
at the next stage of development. They thus fall 
behind their more stimulated contemporaries. 
But like those children who receive harmful stim- 
uli, many of these isolated children do respond 
later to stimuli when they receive them. As men- 
tioned in the preceding review of data, many of 
them eventually “catch up” with other more for- 
tunate children of the same age (49). Since 
some institutionalized children can respond fa- 
vorably to stimulation after infancy, the different 
responses to later experiences may lie in consti- 
tutional qualities rather than in the severity of 
deprivation. 

Resemblances between childish and adult re- 
sponses may occur when both express the un- 
changed character of the personality without hav- 
ing any causal connection. Infants at birth show 
wide variations in their spontaneous behavior and 
in their responses to stimuli (53, 54, 55, 56, 57). 
They exhibit the anlage of their fully developed 
characters. For example, Gesell (58) demon- 
strated in young infants the first expressions of 
fundamental traits of personality, e.g., motor ac- 
tivity, affection, humor, curiosity, tolerance for 
frustration, etc. The infants studied showed these
traits before the impact of parental behavior could
have had anything to do with their origin. At 
5 years of age the children exhibited the same 
traits, although in a more developed form, with 
remarkable consistency. It seems reasonable to 
suppose then that many of the responses of in-
fancy and childhood may be not the causes of
character, but its expression. To say this is
not to deny that character can be changed through 
experiences, but as mentioned earlier, it is changed 
by the way in which events are experienced rather 
than by the events themselves. 

Still another resemblance between infantile or
childish and adult responses may occur in the process of
regression, in which the adult returns to a previ- 
ous pattern of behavior. But regression indicates 
a current stress too great for mastery. Like 
sleep, it is something of which we all are capable. 
The fact of returning to childlike behavior does 
not mean that the events of childhood were es- 
pecially severe, or even especially important for 
the personality who regresses, although they may 
have been. 

There exist then a number of ways in which 
adult responses to stress may come to resemble 
infantile or childish ones. In each of these ways 
we can account for the resemblance without the 
hypothesis of a special impressionability of the 
infantile personality which would make the events 
of the early years necessarily more important to 
the growth of the personality than those of later 
years. 


SIGNIFICANT DIFFERENCES BETWEEN THE 
PERSONALITIES OF CHILDREN AND ADULTS 


The problem may receive some further clarifi- 
cation from considering the important differences 
between the personalities and behavior of infants 
(and children) and adults. 

We know that infants lack coordination and 
skill in the use of their musculature. We may as- 
sume, although we cannot positively know, that 
infants also lack the organization of perceptions 
and thoughts which comes in later life; yet we 
cannot deduce from these facts and assumptions a 
greater sensitivity to environmental influence. If 
we were to accept a comparison of the infantile 
mentality with that of the dreamer or the delirious 
patient, we would expect the infant to be less sus- 
ceptible to outside influences than he is when his 
mind is more fully developed. 

Impressionability in infants and children may
arise from another important difference between
them and adults. Infants and children lack past 
experience (memory) with which to evaluate cur- 
rent events, but this means that events have a dif- 
ferent significance for the infant and child than 
for the adult. The infant and child respond to 
events according to their meaning for them. We 
have no grounds for believing that events are nec- 
essarily more meaningful or more frequently given 
harmful meanings in infancy than in adulthood. 
They simply have different meanings. If you take 
a toy away from a child, he will probably cry, but 
if you tell him the mortgage has been foreclosed 
he will probably go on playing with the toy. We 
have no proof that within the world as he sees it, 
a stress is any harder to bear in infancy than in 
adulthood.

The experiential deficiencies of children do, 
however, place them at a special disadvantage in 
relationships with adults. Their fund of infor- 
mation is largely drawn from the supplies of par- 
ents. Children are like the country bumpkin in 
the hands of the city slicker. They are the per- 
petual captive audience of their parents. Their 
ignorance makes them more suggestible than 
adults. However, tests of suggestibility show that 
this reaches a peak in the years from 7-9 and 
thereafter falls off, being lower for obvious rea-
sons in infancy and also in adulthood (58). In
view of the marked influence of suggestion on per- 
sonality, we may eventually have to ascribe spe- 
cial importance in the formation of personality 
to the years 7-9 as much as to the years of infancy 
and early childhood. 

The physical helplessness of children which ties 
them for many years to one family greatly re- 
duces the opportunities for the correction of faulty 
information provided by the parents. The ordi- 
nary adult has many more opportunities for in- 
creasing his experiences than the child who must 
largely live in a world of experiences chosen for 
him. The immobility of the infant makes him 
particularly dependent upon adults for stimuli 
with which to grow. (The prisoner in a concen- 
tration camp has the same inability to modify his 
experiences.) Thus it happens that a great many 
persons reach adulthood with large areas of living 
completely unexplored. Some of the impression 
that personality becomes fixed in childhood may 
arise from the widespread constriction of experi-
ences in many children and adults. Their per-
sonalities fail to change, not because they have
permanently jelled, but because they never have
the new experiences which seem essential for any 
change. Parenthetically, psychotherapy provides 
one kind of intense contact which can modify and 
undo the effects of the experiences of childhood. 
Psychotherapeutic transformations of personality 
in adulthood should additionally warn us against 
viewing the adult personality as rigidly fixed in 
childhood. 

Infants and small children exhibit a further dif- 
ference from adults which at first glance may 
seem to make them more sensitive to environ- 
mental stimuli. Their emotions are less organ- 
ized, less inhibited, and less suited to the occasion 
than those of most adults. Excessive or inappro- 
priate emotional expression perhaps more than 
any other quality gives rise to the epithet “child- 
ish” when seen in adults. Yet from the fact that 
children’s emotions lack the refinements of direc- 
tion and discharge found in most adults, we can- 
not argue that children are thereby experiencing 
more durable effects from the events to which 
they respond. Such durable harmful effects seem 
to come much more often from the inhibition of 
emotional expression than from the reverse. In- 
deed, there may be a connection between the abil- 
ity of children to express emotions freely and their 
well-known resilience to frustration. Very com- 
monly a punished child becomes ready to forgive 
an angry parent long before the parent has recov- 
ered from his own anger (or guilt). Such resili- 
ence in turn probably accounts for the rarity in 
childhood of the prolonged hatreds and guilts 
which burden so many adults. As mentioned ear- 
lier, children adapted best of all to the horrors of 
concentration camps, and they can adapt in ways 
astonishing to adults to a wide variety of new 
situations. Such adaptability is not consistent 
with the view that children are more liable to 
show lasting effects of stresses than adults. What 
we know of the emotional life of children sug- 
gests that they may indeed be more impressionable 
than adults, but also more expressive of responses 
and less retentive of harmful effects. In short, 
their minds may be wax to receive, but not marble 
to retain the imprint of events. 

This review permits no conclusions on this
topic, except as to the need for research. Such re-
search may ultimately confirm in a scientific man-
ner the belief that human personality is more
plastic in infancy and childhood than in adult-
hood. Alternatively, it may show this assumption
to have been a scientific myth. A third and more 
probable result may be the demonstration that the 
human personality is more plastic during the early 
years in certain modalities and less plastic in 
others, and similarly we may find that adults can 
change more readily than children in some areas 
and less so in others. 


SUMMARY 


The article reviews data, much of rather recent 
origin, bearing on the assumption that human 
personality is more plastic in infancy and child- 
hood than in adulthood. The available data per- 
mit the following conclusions: 

1. We have no compelling evidence of a pre- 
dictable relationship between child training prac- 
tices and later personality. 

2. Severe psychological stresses can have as 
marked effects in adulthood as in infancy and 
childhood, sometimes having greater effects in 
adulthood than in childhood. 

3. Important personality changes occur after 
childhood (in the absence of treatment) including 
the disappearance of marked psychological dis- 
orders. 

4. Infants reared according to ostensibly ideal 
methods of infant care show no greater immunity 
to mental illness than do other children reared 
differently. Infants reared under apparently in- 
adequate or harmful circumstances do not neces- 
sarily develop psychological disorders. 

5. Resemblances between patterns of behavior 
in children and adults can be explained without 
the hypothesis of a special impressionability or 
vulnerability of personality in childhood. 

6. The initial immobility and the prolonged 
physical dependency of children upon adults 
place them at a special disadvantage in that they
cannot readily change their environments to ob- 
tain new experiences. A lack of new experiences 
may give to the personality a pattern which ap- 
pears more fixed than it really is. 

The assumption that the human personality is 
more plastic in infancy and childhood than in 
adulthood remains unproven. Neither is it dis- 
proven. We need much further research in this 
area and this research may eventually show that 
the human personality is more plastic during 
childhood in some respects, but not in others.

for many years. I don’t know how others react
to such encounters; my own reaction is typically 
one of surprise on discovering how much my long 
unseen friends have aged in the interim! Those 
of us who have lived long enough to attend a 
score or more annual conventions have for the 
most part accepted the inevitability of aging, and 
implicitly assume that the passing years will be 
accompanied by wrinkles, spectacles, additional 
weight, and grey hair—providing any remains to 
change its color. These and other changes in the 
soma are so highly predictable that we take them 
pretty much for granted. In the realm of be- 
havior we are also accustomed to anticipate cer- 
tain more general changes: a slowing of pace, 
less participation in active sports, and fewer late 
parties. 

On the basis of available evidence, psycholo- 
gists are not likely to anticipate marked changes 
in the intelligence of their friends—at least, not 
until relatively late in life. Even though a col- 
league’s intellectual productivity may decline in 
the middle and later years, we are inclined to give 
him credit for being about as bright as he ever 
was. 

What about our expectations and anticipations 
regarding changes in those other aspects of the 
individual which, for want of a better name, we 
call personality? Do we expect to find our for- 
mer colleague pretty much the same sort of per- 
son that he was 15 or 20 years before, or are we 
prepared to find that he has changed markedly 
with the passing years? William James would 
have expected little or no change. You will re- 
call the passage from his famous lecture on 
“Habit” which reads: 

Habit is thus the enormous fly-wheel of society, 
its most precious conservative agent. . . . Al-
ready at the age of twenty-five you see the pro-
fessional mannerism settling down on the young
commercial traveller, on the young doctor, on the 
young minister, on the young counsellor-at-law. 
You see the little lines of cleavage running 
through the character, the tricks of thought, the 
prejudices, the ways of the “shop,” in a word, 
from which the man can by-and-by no more 
escape than his coat-sleeve can suddenly fall into 
a new set of folds. On the whole, it is best he 
should not escape. It is well for the world that 
in most of us, by the age of thirty, the character
has set like plaster, and will never soften again
(7, p. 121). 

Whether one’s thinking about these matters 
stems from the writings of William James or those
of other psychological theorists, the answer is
likely to be the same; on perhaps no other major 
issue do widely variant psychological theories 
lead to such congruent predictions. Whether one 
is an extreme hereditarian, an environmentalist, a 
constitutionalist, or an orthodox psychoanalyst, 
he is not likely to anticipate major changes in 
personality after the first few years of life. Not 
only do psychologists of different theoretical per- 
suasions tend to agree on this issue; it happens to 
be one on which the layman and the scientist 
share a common opinion. Perhaps because of 
the need to believe in consistency of one’s self 
from moment to moment and from year to year, 
we tend to infer an unwarranted degree of con- 
sistency in others. Some consistency is indeed 
necessary for social intercourse, and it is likely 
that, as a matter of convenience in remembering 
and dealing with our associates, we utilize stereo- 
typy to a considerable degree and thus tend to 
infer greater consistency in others than may be 
the case. 

Although diverse theories and lay opinion lead 
to the assumption that there will be but little 
change in personality in adulthood, belief in the 
possibility of inducing change is implicit in the 
professional activities of all persons engaged in 
advertising, public relations, and psychotherapy. 
While theory underlying these activities is often 
not explicitly expressed, anyone who attempts to 
change the attitudes, values, habits, and defense 
mechanisms of adults may be assumed to hold a 
position somewhat as follows: “Yes, it is true 
that the human personality is formed early in life 
and by late adolescence is quite resistant to 
change. However, by the skillful application of 
special techniques, it is possible, though admit- 
tedly difficult, to effect significant changes in be- 
havior.” Some practitioners go as far as to sug- 
gest that it is possible to produce changes in the 
basic personality structure. 

We must pause to note one further exception 
to the otherwise generally accepted assumption 
regarding consistency of the adult personality. 
While assuming that other adults are not likely 
to change, each of us, I suspect, wants to keep 
his theory sufficiently flexible to permit the pos-
sibility of changes in himself—especially changes
in the direction of his ego ideal! Even though in 
retrospect few of these desired changes may have 
occurred, it’s comforting to think that one can 
change if one tries hard enough. 

A more than casual interest on my part in the 
problem of personality consistency in adulthood 
began to develop about eight years ago in con- 
nection with the VA assessment project, in which 
Fiske and I were concerned with the prediction 
of performance of young clinical psychologists 
after four years of graduate training (9). Since 
the potential accuracy of our predictions of fu- 
ture performance was limited by the stability of 
ability and personality variables over the time 
period involved, we became concerned with the 
question of consistency of personality over rela- 
tively long intervals. A review of the literature 
revealed but few relevant studies. By all odds, 
the most extensive evidence available dealt with 
scores derived from the Strong Vocational In- 
terest Blank. Already in 1943, Strong (17) was 
able to report relatively stable correlations of 
vocational interest scores over intervening pe- 
riods of one, two, three, five, six, nine and ten 
years. And in 1951, he reported a median cor- 
relation of .75 for profiles of vocational interests 
for college seniors retested after 22 years (16). 
A few additional studies, reporting the results of 
repeated administrations of other psychological 
tests to college students in one or more successive 
years of their college careers, have appeared; for 
example, Whitely (19) in 1938 reported cor- 
relations for the six scores derived from the 
Allport-Vernon Scale of Values, based on tests 
administered to students as freshmen and seniors. 

Because of the paucity of studies bearing on 
the problem of consistency of personality, Fiske 
and I attempted an evaluation of the consistency 
of personality variables over four years for the 
subjects originally assessed by us in the summer 
of 1947. Since our basic experimental design
was not oriented to this particular problem, our 
results were in no sense definitive. For example, 
we did not readminister any of the same person- 
ality measures four years apart. We did, how- 
ever, have two sets of comparable data which 
promised to throw some light on the question. 
More specifically, all subjects assessed during the 
summer of 1947 were rated by three peers with 
whom they were in the closest association during
the week-long assessment period. Four years
later, the same subjects were rated on the same 
scales, this time also by peers but not the same 
judges who rated them in 1947. The interjudge 
reliability of the first set of ratings ranged from 
.64 to .92 for 22 variables, with a median value
of .75. Although they were not computed, there is every
reason to believe that similar reliabilities char- 
acterized the second set of ratings four years 
later. However, somewhat to our surprise, we 
found that the median correlation between these 
sets of ratings four years apart was only .21, the 
range being .00 to .43. In brief, we were con- 
fronted with the situation that several judges 
looking at samples of behavior of a person at the 
same time agreed reasonably well, but that dif- 
ferent judges looking at samples of behavior of 
the same individual four years apart showed but 
little agreement in their ratings. 

That our subjects were somewhat more con- 
sistent over this period of time than indicated by 
these correlations between the two sets of ratings 
is shown by the fact that, for each of several
criterion variables, one or more objective test 
scores predicted performance over the four-year 
time period with validities considerably greater 
than the above median correlations between these 
two sets of personality ratings. We were forced 
to conclude that the relatively low intercorrelation 
between the ratings by the two sets of judges over 
this period of time was a function not only of 
changes in the subjects, but of changes in the 
frames of reference in the judges themselves, 
changes associated with the training program that 
they had undergone during the intervening period. 

Since the completion of the assessment project, 
both Fiske and I have continued to pursue the 
general problem of intra-individual variability, 
he concerning himself with relatively short time 
intervals (6) while I have become more inter- 
ested in time intervals even longer than four 
years.

In his presidential address to this Association
in 1932, Walter Miles lamented the absence of
evidence regarding human development during 
the period of maturity, later maturity, and 
senescence. He said, 

Psychologists have exhibited great interest in 
the first two and a half decades of life. Insofar 
as human behavior has been carefully measured 
and check-measured, attention has usually been 
directed to this segment of positive development. . . .
Important as this work has been and
now is, still it leaves five or six decades of human 
adult life relatively untouched. Maturity, later 
maturity, and senescence are still a realm for 
folklore, anecdote, and personal impression (11, 
p. 101). 

During the nearly quarter of a century follow- 
ing Miles’ statement, many psychologists have 
turned their attention to the field of gerontology, 
with the result that Shock in his recent bibliog- 
raphy (14) was able to list over 1,000 psycho- 
logical references. For the most part, attention 
has been directed to the period of adult life 
which Miles termed “later maturity.” Evidence 
regarding the course of human maturation dur- 
ing the adult years is still rather limited. In 
the relatively brief history of psychology, early 
attention was focused first on children of school 
age and next on the earlier years of childhood. 
Still later, a few investigators began to work with 
infants, while the ready availability of college 
subjects led to greatly increased knowledge about 
the period of late adolescence—at least for the 
selected sample of persons who go to college. 

The work of psychologists in the military serv- 
ices during the two World Wars added consider- 
able new knowledge of early adulthood and many 
current investigations are being conducted in in- 
dustry and hence on adult subjects. However, 
for the most part, investigators utilizing adult 
subjects have been primarily concerned with spe- 
cific problems which lead to the employment of 
research designs which, while adequate for the 
problem at hand, rarely yield definitive data bear- 
ing either on the course of development or on 
intra-individual consistency. Many such studies, 
however, especially those involving cross-sectional 
comparisons of different age groups, have pro- 
vided data which suggest the potential importance 
of maturational trends in adulthood. 

While the data provided by cross-sectional com- 
parisons of different age groups are often highly 
provocative, they unfortunately are not adequate 
to permit firm conclusions regarding either de- 
velopmental trends or intra-individual variability. 
In a recent monograph reporting one of the few 
long-term longitudinal studies of mental ability, 
Owens observes: 


. . . cross-sectional studies demand an excessive
number of somewhat unlikely assumptions
and are therefore open to varying and ambiguous
interpretations. Prominent among the
problems involved is that it is extremely difficult
to secure comparable samples of the population
at successive ages, and to be assured
that they are in fact so comparable that it is
something more than gratuitous to attribute all
differences between them to a single variable
such as chronological age (12, pp. 7-8).

In the same vein, Kuhlen (10) notes that unless 
sampling is so precise that the younger subjects 
may be truly assumed to be what the older sub- 
jects once were, cultural changes and age changes 
are almost indistinguishable. 

The paucity of longitudinal studies covering 
any major span of adult years is in no small part 
due to the fact that appropriate techniques of 
psychological measurement are themselves just 
coming of age. A few courageous pioneers such 
as L. M. Terman, J. W. Anderson, Walter Dear- 
born, and Jean Macfarlane had enough faith in 
early intelligence tests to undertake long-term 
follow-ups of subjects first studied as children. 
In addition, we have previously mentioned the 
work of Strong on the stability of vocational 
interests. 

In 1952, Madorah Smith published “A Com- 
parison of Certain Personality Traits as Rated in 
the Same Individuals in Childhood and Fifty Years 
Later” (15). While admittedly limited by an N 
of six children of the same family and the absence 
of any objective measures of personality, this in- 
teresting paper pointed to the probability of con- 
siderable consistency of several personality vari- 
ables over a period of nearly half a century. 

In 1953, Owens (12) published the results of a 
study involving the administration of the Army 
Alpha to 127 freshman males at Iowa State Col- 
lege in 1919, and its readministration 30 years 
later. In this as yet little known monograph, 
Owens reports a significant increase in scores for 
five of the eight subtests of the Alpha as well as 
for the total Alpha score, of one half sigma of 
the original distribution. There were no signifi- 
cant decreases in the mean scores on any subtests. 
More relevant to our present interest is the fact 
that the test-retest correlation for the total Alpha 
score over this period of 30 years was .77. Con- 
sidering the fact that the sample of subjects 
studied by Owens represented a restricted range 
of talent, this is indeed convincing evidence of the 
general stability of adult intelligence over a 30-year
time span. Further evidence pointing to the
possibility of continuing intellectual maturation 
during adult years appears in a recently pub- 
lished study by Bayley and Oden (2). These 
investigators administered the Concept Mastery 
test to Terman’s gifted subjects and their spouses 
in 1939-40, and an equivalent form again in 
1950-52. Highly significant increases in scores 
were found for both men and women, for the 
gifted subjects and their spouses, for all occupa- 
tional and educational levels and for all age 
groups. Again, however, the consistency of in- 
tellectual level was high, with test-retest correla- 
tions of about .90. 
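
The two findings just cited, a rise in mean score together with a high test-retest correlation, are not in conflict: the correlation reflects how well subjects keep their relative standing, not whether the group as a whole moves. The following is a minimal sketch of that point. The scores are hypothetical and are not data from Owens or from Bayley and Oden; the function name `pearson_r` is introduced here only for illustration.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two paired score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores for five subjects (illustration only).
time1 = [100, 110, 120, 130, 140]
time2 = [112, 118, 135, 138, 155]  # every subject gains; rank order is preserved

r = pearson_r(time1, time2)
mean_gain = sum(time2) / len(time2) - sum(time1) / len(time1)
```

In this invented example every subject gains, yet the correlation stays high because the subjects keep the same order at both testings.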


THE PRESENT INVESTIGATION 2


Within the past year I have been fortunate in 
obtaining a considerable amount of data con- 
cerning consistency of selected personality vari- 
ables in the adult personality. This is because 
21 years ago, at the youthful age of 28, I had the 
temerity to plan a longitudinal study. Lest I 
seem to take credit for a degree of foresight which 
I did not have at that time, let me hasten to add 
that the initially projected duration of this study 
was only seven years. For a variety of reasons,
especially the disturbing effects of World War II, 
the definitive follow-up stage of this study had 
to be postponed so that it is only now being 
completed. 


2 In presenting this first major report growing out 
of this long-term project, I wish to express my
appreciation to the many institutions and individuals
contributing to it. Only one who has carried out an
extended longitudinal study can fully appreciate the 
many and varied obligations incurred. To the Com- 
mittee for Research in Problems of Sex of the Na- 
tional Research Council I am indebted for grants 
which made possible the initiation of the project and 
collection of the original data between 1934 and 1939. 
A grant from the Faculty Research Fund of the Uni- 
versity of Michigan in 1952 permitted planning the 
follow-up which was transformed into a reality by 
grants during the last two years from the Founda- 
tions’ Fund for Research in Psychiatry. The three 
universities with which I have been associated have 
each contributed research facilities and an atmosphere 
conducive to research. During the last few months 
the International Business Machines Corporation 
greatly facilitated the analysis of the data by making 
available one of its newer electronic computers. A 
score of research assistants have contributed ideas as 
well as helping to carry out the actual work of the 
investigation. Finally, I want to thank the several 
hundred men and women subjects whose intelligent 
cooperation over 20 years made this study possible. 

Let me be a little more explicit. In 1934, I
began a program of research designed to answer 
five questions: 

1. How do young men and women pair off in 
marriage? 

2. What characteristics of individuals are as- 
sociated with sexual and marital compatibility? 

3. What combinations of characteristics in hus- 
bands and wives are associated with sexual and 
marital compatibility? 

4. How do individuals change during the 
course of marriage? 

5. How are these changes related to the nature
of the marriage relationship established? 

During the years 1935-38, I enlisted the 
cooperation of 300 engaged couples. Each of 
these 600 individuals was assessed with an elab- 
orate battery of techniques including anthropo- 
metric measures, blood groupings, a battery of 
psychological tests, and a 36-variable personality 
rating scale. In addition, a personally adminis- 
tered questionnaire was used to obtain essential 
biographical data. 

Each of the participating subjects agreed to 
advise me of the date of his marriage if the en- 
gagement eventuated in a marriage, or of the 
broken engagement if it did not. The original 
research design called for an annual follow-up 
questionnaire from each husband and wife for 
seven years, and retesting at the end of the seven- 
year period. 

The follow-up program was initiated on the an- 
niversary of the first marriage and followed until 
1941, at which time it was interrupted by the 
general dislocation of all civilian activities. The 
subjects were advised of the writer’s intention to 
return to these studies after the war. In spite of 
these good intentions I was not able to give 
serious attention to the project again until 1952-53.
That year was spent in re-ordering all previously
collected data and planning a full-scale
follow-up study to be carried out in 1953-54. 

Plans for this follow-up study called for re- 
contacting as many as possible of the original 
600 subjects, securing as a minimum a report 
on the present outcome of the marriage or of the 
engagement, and inviting all subjects to partici- 
pate in the final follow-up phase of the study 
which included (a) retesting on five of the seven 
psychological tests used in the original battery, 
and (b) reporting in detail on the marriage between
research partners and other intervening
life experiences. 

In spite of the fact that 16 to 18 years had 
elapsed between the time of the original testing 
and the initiation of this major follow-up pro- 
gram, we were successful in securing definitive 
information regarding the present outcome of 
all 300 engagements. Parenthetically, it may be 
of interest to report these outcomes: 278 of the 
original 300 engagements resulted in marriage 
of the research partners. There were 22 broken 
engagements; all but 5 of the 44 individuals in- 
volved later married someone else. Of the 278 
marriages, 12 were terminated by death and 39 
by divorce. After nearly 20 years, then, 454 
of the original 600 persons are still living as 
husband and wife in 227 marriages. 
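
The outcome figures in this paragraph can be checked against one another with simple arithmetic. The sketch below uses only the counts stated above; the variable names are introduced here for illustration.

```python
# Counts taken from the paragraph above.
engagements = 300
broken_engagements = 22
ended_by_death = 12
ended_by_divorce = 39

marriages = engagements - broken_engagements                    # 278
intact_marriages = marriages - ended_by_death - ended_by_divorce  # 227
persons_still_married = 2 * intact_marriages                    # 454
persons_in_broken_engagements = 2 * broken_engagements          # 44
```

Each derived value agrees with the figure reported in the text.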

As might be expected, the subjects, although 
originally contacted in the New England area, 
were when recontacted widely dispersed through- 
out the United States, and several of them live 
in foreign countries. It was therefore necessary 
to plan to collect all data in this follow-up phase 
of the project by mail. Because we planned to 
ask for approximately six hours of further par- 
ticipation on the part of each subject, it was de- 
cided to mail forms to the subjects in two sets. 
The first of these, mailed in August, 1954, in- 
cluded six forms: the five tests being readmin- 
istered, and one new instrument, a specially pre- 
pared form of Osgood’s Semantic Differential. 
These materials were sent to 521 subjects. The 
remainder of 1954 was spent in the preparation 
of two detailed questionnaires, one designed to 
permit each subject to report on the details of 
his own life experience during the intervening 
years, and the other to report the details of his 
marriage. The second set of forms was placed 
in the mail about the first of this year. Com- 
pleted retest forms were returned by 446 of the 
521 subjects, or 86%. While this return is not 
the 100% which we ideally might have hoped 
for, it is sufficiently large to encourage us to be- 
lieve that findings based on an analysis of the 
data will be reasonably representative of the 
entire sample. 

I wish that sufficient time had elapsed since 
the collection of these new data for me to sum- 
marize even tentatively our findings relevant to 
the five questions asked at the beginning of the 
project 20 years ago. Such, however, is not the
case. In fact, all the data have not yet been
coded. Fortunately, the personality retest data 
were obtained in time to permit a series of analyses 
concerning the changes in personality variables 
over this fairly long span of years. At this time, 
then, I should like to report to you the findings 
growing out of these analyses. Even with re- 
spect to the problem of personality consistency 
and change, we have not been able to complete 
all of the detailed analyses needed for a defini- 
tive report and interpretation. 

I am sure you will want to know a little about 
the subjects represented in the sample studied. 
At the time of original testing, all were members 
of couples with definite anticipations of marriage. 
The resulting sample is obviously a select one, in 
that it is composed of persons who responded 
positively to an invitation to participate in a long- 
term scientific study of marriage and were will- 
ing to contribute initially six to eight hours of 
their time as well as enter into an agreement to 
report annually for seven years on the outcome 
of their marriage. It is not surprising, therefore, 
that the resulting sample turned out to be superior 
to the general population in education and in- 
telligence. Only 1% of the men never went to 
high school and 75% had at least one year of 
college; nearly 20% had some sort of graduate 
or professional training. The females were some- 
what less selected on the basis of education; 
nevertheless, approximately two-thirds of them 
had attended college for varying lengths of time. 
The IQ equivalent of the mean score on the Otis 
Self-Administering Test of Mental Ability was 
115 for the males and 112 for the females at the 
time of the original testing. The mean age of 
the men at the time of the original testing was 
26.7 and that of the women 24.7, with nearly 9 
out of 10 of the subjects being between the ages 
of 21 and 30. With respect to religious affilia- 
tion, 82% of the males and 89% of the females 
indicated membership in some church. Approxi- 
mately 11% of the sample indicated a preference 
for the Catholic and 8% for the Jewish faith. 

We can never know in what manner and to 
what degree our sample is selected by virtue of 
its being composed of persons who volunteered 
to participate in a study of marriage. Admittedly, 
it does not include, for example, the sorts of 
people who marry impulsively or those who still 
regard marriage as a relationship inappropriate 
for scientific study. However, in a study such
as this, one cannot hope for a sample truly repre- 
sentative of the general population. Our goal 
was that of securing a sample with sufficient 
variation on each of the variables studied to per- 
mit analyses of covariance. In this respect we 
succeeded. In spite of the operation of known 
selective factors, the sample studied was char- 
acterized by wide individual differences with re- 
spect to each of the roughly 200 variables on 
which the subjects were assessed. And except 
for education and intelligence, the resulting dis- 
tributions on the other variables were very similar 
to those of normative samples. 

Since in any study of change it is necessary to 
obtain measures at two points in time, the retest 
data which I shall report are based on subsam- 
ples of the original samples: those subjects who 
accepted the invitation to participate in the re- 
test phase of the project. These subsamples in- 
cluded 215 of the original 300 males and 231 of 
the original 300 females. Furthermore, in order 
to facilitate the data analyses, I have excluded 
all cases for whom there was missing any original 
or retest score on any one of the 103 scores 
derived from the five tests. The resulting N’s 
are 176 males and 192 females. As might be 
expected, a comparison of the retested and non- 
retested samples revealed differences on many of 
the original measures. While many of these dif- 
ferences are statistically significant and are of in- 
terest in themselves as characterizing groups that 
did and did not choose to participate in the final 
phase of the project, they are relatively small in 
magnitude and do not show a systematic pattern 
of differences for the two sexes. It appeared de- 
fensible, therefore, to carry out our analyses of 
stability and change on those personality variables 
using the records of the 176 males and the 192 
females for whom complete test-retest data were 
available. Admittedly, our findings will be gen- 
eralizable only to a population of adults suffi- 
ciently cooperative to provide comparable data. 
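
The exclusion rule described above, dropping any subject with a missing original or retest score, is what is now usually called listwise deletion. A minimal sketch follows. The record layout, the score names `s1` and `s2`, and the function `complete_cases` are hypothetical; only the counts in the last three lines come from the text.

```python
def complete_cases(records, score_names):
    """Return only the records that have every score at both testings."""
    kept = []
    for rec in records:
        has_all = all(
            rec["time1"].get(name) is not None
            and rec["time2"].get(name) is not None
            for name in score_names
        )
        if has_all:
            kept.append(rec)
    return kept

# Hypothetical mini-sample with two scores instead of the 103 used in the study.
records = [
    {"id": "A", "time1": {"s1": 30, "s2": 12}, "time2": {"s1": 34, "s2": 11}},
    {"id": "B", "time1": {"s1": 28, "s2": None}, "time2": {"s1": 29, "s2": 15}},
    {"id": "C", "time1": {"s1": 41, "s2": 9}, "time2": {"s1": 40}},
]
kept_ids = [rec["id"] for rec in complete_cases(records, ["s1", "s2"])]

# Counts reported in the text.
retested = 215 + 231                       # 446 subjects returned retest forms
return_rate = round(100 * retested / 521)  # per cent of the 521 subjects mailed
analyzed = 176 + 192                       # subjects with complete data
```

Only subject A survives in the toy sample, since B lacks a Time I score and C lacks a retest score.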

We should also keep in mind that whereas I 
shall, in most of the analyses, be treating these 
two samples simply as samples of men and women 
in general, they are further selected as being 
primarily the sorts of people who tend to marry. 
Of the 176 males, 146 were still married at the 
time of the retest; of the 192 women, 156 were 
still married at the time of the retest. And, although
we shall in these analyses not be primarily
concerned with the marriages of these couples, it 
should be pointed out that 116 of these men and 
an equal number of women were still married to 
each other at the time of the retest. To the degree 
that congruent assortative mating occurred, that 
is, to the degree that like tend to marry like, any 
sex differences in the original test scores will tend 
to be smaller than might be found for samples 
of men and women not married to each other. 
Also, since a man and woman married to each 
other may be assumed to have shared a large 
proportion of the life experiences intervening 
between the two testings, it is possible that sex 
differences in changes in test scores are smaller 
than would be found for samples of men and 
women not married to each other. 


TEST BATTERY 


The original assessment battery selected in 1934 
included the following standardized instruments: 
the Otis Self-Administering Test of Mental Abil- 
ity, the Allport-Vernon Scale of Values, the Bern- 
reuter Personality Inventory, the Bell Adjustment 
Inventory, Strong’s Vocational Interest Inventory, 
and two of Remmers’ Generalized Attitude Scales 
(13), one designed to measure Attitude toward 
any Institution, the other, Attitude toward any 
Activity. Because it seemed likely (then as now!) 
that available techniques did not measure ade- 
quately all potentially important aspects of per- 
sonality, we developed a 36-trait graphic person- 
ality rating scale; this was used to obtain three 
sets of ratings for each subject: by self, by re- 
search partner, and by five acquaintances. 

While we should have liked to have obtained 
retest scores on all of these measures, limitations 
in the total amount of time which could be re- 
quested of subjects dictated some reduction in 
the retest battery. The first test to be eliminated 
was the Otis Self-Administering Test of Mental 
Ability. Being a timed instrument, it was doubt- 
ful that subjects should be asked to administer 
it to themselves under strict time limits. Further- 
more, the definitive results of the 30-year follow- 
up study of Army Alpha Scores by Owens in 1953 
made less essential the inclusion of an intelligence 
test in this study. Since the original battery had 
included two adjustment inventories, it seemed
reasonable to eliminate one of them; the Bernreuter
was chosen over the Bell primarily because
the items in the latter are worded primarily
for high school students and approximately a 
quarter of the items deal with adjustment to the 
parental home. Finally, although we should 
have much liked to have obtained personality 
ratings by five present associates of our subjects, 
we decided to deny ourselves this luxury, pri- 
marily because securing such ratings proved to be 
one of the more difficult aspects of the original 
assessment program. We did, however, use the 
original 36-trait rating scale in the retest battery 
to obtain two additional sets of ratings by self 
and by partner. 

These five instruments provided us with scores 
on 103 variables. Lest my audience become 
worried that I am about to discuss changes in 
each of these at length, I hasten to assure you 
that such is not my plan. In fact, because of 
probable redundancy in these variables, it was 
not regarded as necessary to analyze all of them 
in detail. Criteria for selection of variables will 
be mentioned as we now turn to the results, in- 
strument by instrument. In an effort to enable 
you to perceive the results more rapidly, I shall 
present these results in the form of graphs rather 
than tables, even though some precision is thus 
lost. 


Fig. 1. Allport-Vernon Scale of Values. (Rows: Theoretical, Economic, Aesthetic, Social, Political, Religious.)


Figure 1 presents the means at Time I and the 
mean changes after nearly 20 years in scores on
the six scales of the Allport-Vernon Scale of 
Values. Since the Scale of Values is a relatively 
widely used instrument, I will remind you only 
that it is designed to measure the relative promi- 
nence of six basic interests or motives in person- 
ality: theoretical, economic, aesthetic, social, po- 
litical, and religious. The original form of this 
instrument published in 1931 was used both 
for the original and retest. 

Inasmuch as the same general format will be 
used in presenting the data for the other instru- 
ments, certain general features of the figure 
should be noted. The variables are indicated in 
the left-hand column. The scale over which 
scores may range is shown across the top of the 
figure with the high scores on the right. The 
letters M and F in each of the rows are placed 
at points corresponding to the original mean 
scores of the male and female samples. Mean 
changes in scores for each variable are indi- 
cated by arrows showing the direction and ap- 
proximate magnitude of the changes. These 
changes have been indicated in the figure only 
if the difference was at least 2.5 times its standard 
error, in which case the critical ratio has been 
indicated in the column on the right-hand side
of the figure. 
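
The plotting rule just described, showing a change only when it is at least 2.5 times its standard error, can be sketched as follows. The text does not say how the standard error was computed; the sketch assumes the standard error of the mean of paired differences, and the scores and function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def critical_ratio(time1, time2):
    """Mean retest-minus-original difference divided by its standard error.

    Assumes paired scores and the paired-difference standard error.
    Raises ZeroDivisionError if every subject changes by the same amount.
    """
    diffs = [b - a for a, b in zip(time1, time2)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error

def is_plotted(time1, time2, threshold=2.5):
    """Apply the rule: show the arrow only if |CR| reaches the threshold."""
    return abs(critical_ratio(time1, time2)) >= threshold

# Hypothetical scores for eight subjects (illustration only).
original = [30, 32, 35, 28, 40, 36, 33, 31]
retest_shifted = [35, 36, 41, 33, 44, 42, 37, 37]  # consistent upward shift
retest_stable = [31, 31, 35, 29, 39, 36, 34, 30]   # small changes that cancel

shifted_shown = is_plotted(original, retest_shifted)
stable_shown = is_plotted(original, retest_stable)
```

Under this assumption the consistent upward shift would be drawn as an arrow, while the changes that cancel out would not.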

As will be noted, only 5 of the possible 12 
changes on Fig. 1 are significant. By all odds 
the largest, and in fact the most significant, of 
all changes to be reported is that for Religious 
values. Both the men and women score about 5 
points higher in their middle years than as young 
men and women. The change amounts to about 
one-half sigma of the original score distribution. 
Since scores derived from the Scale of Values 
are relative, this shift toward higher Religious 
values was necessarily accompanied by a down- 
ward shift on one or more of the other value 
scales. For the women, most of this downward 
shift occurred in Aesthetic values; for the men, 
it was about equally divided between Aesthetic 
and Theoretical values. Quite frankly, I do not 
know how to interpret this small but significant 
shift toward higher Religious values. Two alter- 
nate interpretations seem equally possible. The 
shift may merely reflect a cultural change which 
has taken place in the last 20 years. Perhaps 
people are generally more religious today than 
they were during the last part of the great depres- 
sion. Equally possible and probably a more 
acceptable interpretation is that in our present- 
day society people tend to become more religious 
as they grow older. A recent personal communication
from Professor Irving Bender reports
a similar enhancement of religious values in a 
small group of Dartmouth students retested after 
15 years. 
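
The remark that scores on the Scale of Values are relative can be made concrete: when the six scores must add up to a fixed total, the changes across the six scales sum to zero, so any gain on one scale is offset elsewhere. The sketch below illustrates this with hypothetical profiles; the total of 180 is an assumption made for the example, and the individual scores are invented.

```python
VALUES = ["theoretical", "economic", "aesthetic", "social", "political", "religious"]

def changes(profile1, profile2):
    """Retest-minus-original change on each of the six value scales."""
    return {v: profile2[v] - profile1[v] for v in VALUES}

# Hypothetical profiles; both are constructed to sum to an assumed total of 180.
time1 = dict(theoretical=32, economic=30, aesthetic=31,
             social=29, political=30, religious=28)
time2 = dict(theoretical=30, economic=30, aesthetic=28,
             social=29, political=30, religious=33)

delta = changes(time1, time2)
net_change = sum(delta.values())  # zero whenever both profiles share one total
```

Here a gain of 5 points on Religious is matched by losses of 3 on Aesthetic and 2 on Theoretical, much like the pattern described for the men.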

One additional aspect of this figure deserves 
your attention, again, because it is also character- 
istic of those which follow. Note that while 
small sex differences are reflected in the original 
means of the men and women on certain of the 
scales, there is but little evidence of sex differ- 
ences either in the direction or in the magnitude 
of the changes in scores. In fact, for the 38 
variables to be discussed, the direction of the 
change was the same for men and women on 32 
of the 38. 

Figure 2 presents the story for 6 of the 8 atti- 
tudes measured. (Two of the attitude scales 
were omitted from the present analysis because 
of incomplete data on a number of the subjects.) 
This figure is to be read in the same way as the 
previous one. Note that only the upper half of 
the pro-con continuum is indicated and that the 
original scores of both the men and women were 
favorable toward most of these attitude objects 
and practices. Note, too, that the changes tended 
to be toward the favorable end after 20 years. 
The one exception is Housekeeping, shown on
the fourth line of the chart. Here we find that
men and women, initially mildly favorable in their
attitude toward this practice, both shift toward
the unfavorable end of the continuum. Whether
this reflects a cultural change or the effect of 20
years of married life, we are not able to say with
any certainty!

Fig. 2. Remmers’ Generalized Attitude Scales. Means at Time I and mean changes
after 20 years. (N = 176 males, 192 females.)

As a measure of interests, the men’s form of 
Strong’s Vocational Interest Blank was used for 
both men and women; this provided comparable 
measures for each pair of research partners. Both 
the original and retest responses to Strong’s Blank 
were scored on 47 variables. Figure 3 presents 
the results for 11 of the vocational interest scores. 
These particular scales were selected on the basis
of two criteria: first, each has a relatively high 
plus or minus factor loading on one of the five 
interest factors, and second, the occupation is one
which might be followed by either men or women. 

While expected sex differences occur in original 
scores of several of the variables, it is again of 
interest that there are relatively few sex differences 
in the changes in scores. Only 5 of the 22 pos- 
sible changes are statistically significant. In the 
case of the CPA score, both men and women score 
significantly higher after 20 years. The men show 
a small but significant shift toward a lower score 
on the Architect scale and the women, for reasons 
which I shall not attempt to explain, score signifi- 
cantly higher on the scale “President of a Manu- 
facturing Concern.” In general, however, note 
that the picture is again one of few and small score 
shifts for either sex. 

Figure 4 presents the data for five other personality
variables. The first two were derived by
applying the Flanagan keys to the Bernreuter Per- 
sonality Inventory, these having been used in pref- 
erence to the four original keys because the two 
are relatively uncorrelated and account for prac- 
tically all of the variance in the other four. Since 
there are sex differences in the raw score norms 
for these two scales, the means for the men and 
women have been located on a percentile scale. 
While there was no essential sex difference in the 
original score for either of these scales, the women 
show a small but statistically significant shift 
toward greater self-confidence at Time II. I shall 
not venture an interpretation of this change until 
we have had an opportunity to determine whether 
or not it is related to other aspects of married life. 
The other three variables shown on this figure 
are the three nonvocational interest scales derived 
from Strong's blank. The first is Masculinity- 
Femininity. As was to be expected, the original 
means for the men and women are widely sepa- 
rated on this scale, the letters M and F corre- 
sponding to the 30th and 3rd percentiles of the 
male adult norms, and to the 1st and 50th percentiles
of the female norms. Not expected on
the basis of the evidence reported by Strong (17) 
was the small but significant shift in the masculine
direction for both the men and women, especially
not expected by one who had been associated
with Lewis M. Terman and Catherine Cox
Miles in the research reported in the volume Sex 
and Personality (18). In fact, all the evidence 
reported in that volume and by Strong would have 
led to just the opposite prediction. The data of 
Terman and Miles, all based on cross-sectional 
comparisons of groups at different ages and with 
varying amounts of schooling, show that the peak 
of masculinity in males is reached in the high 
school period, and that of the females during the 
college period, after which time both show a trend 
toward more feminine scores, the trend being 
more pronounced for men than for women. 
Again, the interpretation of this finding is haz- 
ardous. It may be that our sample studied longi- 
tudinally points to meaningful trends which were 
masked by cultural differences obtaining in the 
developmental periods of the several age groups 
sampled by Terman and Miles and by Strong. It 
may also be true that the last 20 years have been 
accompanied by cultural changes tending to result 
in more masculine scores for anyone who has 
lived his first 20 years of adulthood during this
period. To the extent that during this period the
home has become more mechanized through mod- 
ern appliances, and on the assumption that women 
find that they like the mechanical aspects of home 
appliances, it is understandable that women should 
become somewhat more masculine in their likes 
and dislikes. An equally plausible explanation 
for the shift in masculinity scores in the men for 
the same period is not readily available. Perhaps 
our entire culture is becoming more mechanized 
all the time, and while both men and women react 
favorably to these changes, men respond a little 
more than women. This seemingly simple ex- 
planation may well be the correct one. As an 
hypothesis, it fits both our own findings and those 
reported by Terman and Miles, providing one is 
willing to assume that this mechanization of the 
culture is a process which has been going on 
gradually for several decades. 

The last two scores shown on the figure are two 
additional personality measures derived from the 
Strong Blank: Interest Maturity and Occupational 
Level. It will be recalled that the Interest Matu- 
rity score is based on weights corresponding to 
the differential responses of a representative group 
of United States males at the ages of 15 and 25 
years. At the age of 25 our subjects, both men 
and women, scored at about the 30th percentile 
for 25-year-old men and no significant change 
occurred for either sex over the 20 years. 

The Occupational Level scale is based on 
weights corresponding to the differential responses 
of representative samples of men between the ages 
of 18 and 60, representing what might be termed 
the upper and lower levels of occupations, i.e., 
professional men vs. unskilled men. Here again, 
we note practically identical scores for the men 
and women at the time of the original assessment 
with no significant shift in these scores at the time 
of the later test administration. This point on 
the continuum corresponds to a point about mid- 
way between the mean scores of foremen and 
office workers. 

We now turn to a comparison of self ratings 
made by the subjects at a median age of 25 and 
again 20 years later. Although the rating scale 
used for these self ratings included 36 variables, 
a factor analysis of the ratings of associates showed 
that not more than 10 relatively independent 
dimensions were being tapped by the scale. We 
therefore selected 10 of the 36 variables, each 
with a relatively high loading on one of these 10
factors and each with relatively low intercorrelations
with one another. The findings for these
10 variables are shown in Fig. 5. Since this 
scale was designed for use by relatively unsophis- 
ticated raters, all of the items were originally 
phrased in terms of simple questions such as: 
“How peppy is he? How intelligent is he?” etc. 
The scale was of the graphic type with only three 
“landmarks”: a descriptive phrase at each end of
the scale with the phrase “most people” appear- 
ing at the center of the line. The high and low 
ends of the scales were randomly staggered in an 
effort to reduce halo effect. 

We note first the generally comparable means 
for the men and women in these self ratings. 
While some of the sex differences in the original 
mean ratings are statistically significant, none of 
them are large. Some reason to accept the valid- 
ity of these self ratings is the slight but significant 
difference in self ratings of intelligence by the men 
and women on both occasions, a difference 
roughly proportional to the measured difference 
in intelligence of the two groups. Furthermore, 
self ratings on this simple continuum at Time I 
correlate about .45 with Otis scores. 

Note that significant changes over 20 years oc- 
curred for only 8 of the 20 comparisons. Again, 
too, we find the absence of sex differences with 
respect to these shifts. For each variable show- 
ing a significant shift for the men there is also 
a significant shift for the women. Certain of 
these shifts, although small, are in line with gen- 
eral expectations. Thus, both the men and women 
at the age of 45 rate themselves as somewhat less 
peppy than 20 years earlier; they also report that 
they are inclined to be somewhat less neat in their 
dress and somewhat less broad in their interests. 
I am not sure what to make of the shift toward 
an admitted poorer temper. Perhaps by the time 
one gets to be 45, one is a little more objective 
in evaluating this aspect of one’s personality! 

A summary of the findings with respect to abso- 
lute changes in the mean scores of these 38 per- 
sonality variables is shown in Table 1. We note 
that: 

1. For 20 of the 38 variables, there was no 
significant change in mean score for either sex. 

2. In the case of the 18 variables for which the 
mean change was statistically significant, the mag- 
nitude of the change was still relatively small. 

3. These changes, though small, tend to be in 
the same direction for both sexes.

TABLE 1

NUMBER OF PERSONALITY VARIABLES SHOWING
SIGNIFICANT CHANGES IN MEANS

Allport-Vernon values
Attitudes
Vocational interests
Other personality variables
Self ratings
Total

4. Even though small, each of the significant 
changes in means is of theoretical interest, but, 
in the absence of adequate age norms at the two 
points in time, may be equally well interpreted as 
due to increasing age or cultural change. 


INTRA-INDIVIDUAL CONSISTENCY OF PERSONALITY 
VARIABLES OVER LONG TIME INTERVALS 


We now turn to an analysis of changes in scores 
on these same 38 personality variables for indi- 
viduals. The absence of mean changes could 
have resulted from either of two states of affairs: 
for any measure, individuals could have shown 
little or no change, or alternately, changes in the 
scores of individuals could have cancelled each 
other. 

In this analysis of change, we shall first com- 
pare the retest correlations over the 20-year time 
span with retest correlations on the same measures 
for relatively short time intervals. Again, we shall 
utilize graphical presentation of the results. 

Figure 6 presents the findings for the Allport- 
Vernon variables. For each of the variables 
shown on the left of the chart, the black bar 
indicates the retest correlation over a period of 
12 months for college students tested by Whitely 
(19) as juniors and again as seniors. The striped 
bar indicates the magnitude of the retest correla- 
tion over the approximately 20-year time span 
for our subjects.

In these charts we have combined the data for
our men and women subjects since the values of 
these correlations for the men and women were 
generally within sampling errors of each other. 
In general, our data lend no confirmation to the 
popular belief that women are more fickle than 
men. 

Looking again at Figure 6, it will be seen that 
for all of the six Allport-Vernon variables, the 
test-retest correlations over 20 years are consider- 
ably smaller than those for the 12-month time 
interval. Thus, the value for the longer time 
interval for the Theoretical scale is .51 for our
subjects as compared with .71 reported by
Whitely. It is also of interest to note that the 
scores on Social values, which are measured less 
reliably than the other five values, show the lowest 
test-retest correlation over the 20-year period. 
Figure 7 presents comparable results for the six 
sets of attitude scores. For these measures, we 
were unable to obtain any test-retest correlations 
over short time periods and, therefore, have 
plotted the black bar to correspond to the re- 
ported Form A-Form B reliability of the scales, 
i.e., retest correlations over a very brief time interval.
It is immediately obvious that the attitude
scores of our subjects were much less stable than
their value scores on the Allport-Vernon. Thus
we note that there is almost no relationship between
scores on the attitude toward Marriage at
Times I and II. The highest value shown on the
figure is .33 for attitudes toward the practice of 
Housekeeping, as compared with a reported relia- 
bility of .79 for this particular scale. 

By contrast, over this long time span, vocational 
interest scores for our subjects were relatively 
stable. Figure 8 presents the essential data for 9 
of the 11 vocational interest scores used. Since 
for several of the scales Strong has provided data 
showing test-retest correlations for periods of one 
week and one year (16, p. 78), we have incorpo- 
rated both of these estimates of short-term con- 
sistency in this chart. The black bars refer to 
retest correlations over a period of one week and 
the unshaded bars to correlations for a retest 
interval of one year. 

As was anticipated on the basis of Strong’s pre- 
viously reported findings on the long-term stabil- 
ity of vocational interests, these correlations tend 
to be relatively high; the median is .62 for men 
and .57 for women. While for all scales the 20- 
year retest correlation is somewhat lower than the 
one-year correlation, the difference in the values 
for some occupations is rather small. 


Turning now to the other personality variables 
(Fig. 9), we find that the story is much the same. 
Since no retest correlations over short time intervals
were available for the Bernreuter scores, the
shaded bars correspond to the reported reliabilities 
of these scales. It is of interest to note that the 
retest correlations for the Masculinity-Femininity 
scores are of about the same magnitude as those 
for the vocational interest scores on the Strong 
Blank. By contrast, we note a much lower value
(.46) for the Interest Maturity scores even though 
these two Strong scales have about the same re- 
ported reliabilities and show the same retest cor- 
relations over short time intervals. 

The last line of this chart deserves special atten- 
tion in that it shows the only significant sex dif- 
ference in consistency of personality measures 
over this long time span: a value of .62 for our 
males and .37 for the females. It will be remem- 
bered that the Time I scores on this OL variable 
were approximately equal for the two sex samples 
and that neither group shifted its mean scores 
significantly over the 20 years. This little-understood
scale may measure something less relevant
to women than men, it may measure an aspect 
of personality which stabilizes later in women than 
in men, or this may be just a chance difference at 
the .01 level of significance.

What about the consistency of the self percept
as reflected in self ratings on the personality vari- 
ables at two points widely separated in time? Our 
findings are shown in Fig. 10. The black bars 
indicate the retest correlations between self ratings 
of college sophomores one week apart; the median 
value is .63. Again, we find our retest correla- 
tions after 20 years considerably smaller in mag- 
nitude, yet all statistically significant. The me- 
dian values are .33 for men and .39 for the 
women. 

Just as Strong found the profiles of the Vocational
Interest Test scores to show considerably
more long-term stability than scores on individual
scales, it may be assumed that the stability of the
over-all self percept is considerably greater than
reflected by the median values of these correla- 
tions on single dimensions. As a test of this 
hypothesis, we computed indices of profile con- 
gruency on these 10 self-rated dimensions at two 
points in time. Using a subsample of 20 cases, 
and Kendall’s tau as an index of congruency, the 
median profile correlation over 20 years for these 
10 traits was found to be .55. By way of com- 
parison, the median value for the Allport-Vernon 
profile was found to be .65. Strong has reported
a median profile correlation of .75 for the Vocational
Interest profile over 22 years.
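
As a rough illustration of the profile-congruency index described above, the sketch below computes Kendall's tau between one subject's Time I and Time II self ratings on 10 dimensions. The function name `kendall_tau` and the example ratings are hypothetical, and the simple tau-a form used here (no adjustment for tied ratings) is an assumption, since the text does not say which variant was computed.

```python
from itertools import combinations


def kendall_tau(time1, time2):
    """Kendall's tau-a between two equally long rating profiles."""
    if len(time1) != len(time2) or len(time1) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    concordant = 0
    discordant = 0
    # Compare every pair of traits: do the two occasions order them alike?
    for i, j in combinations(range(len(time1)), 2):
        product = (time1[i] - time1[j]) * (time2[i] - time2[j])
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    n_pairs = len(time1) * (len(time1) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical ratings of one subject on 10 traits at the two occasions.
ratings_time1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ratings_time2 = [2, 1, 3, 5, 4, 6, 7, 9, 8, 10]
tau = kendall_tau(ratings_time1, ratings_time2)
```

In the hypothetical profile, three of the 45 trait pairs change order between occasions, so tau is (42 - 3) / 45. A median of such values across subjects would correspond to the kind of figure quoted in the text.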

At this point let us summarize the evidence 
concerning the relative consistency in adulthood 
of the several domains of personality variables for 
which data are available. In estimating the rela- 
tive consistency we first corrected the median re- 
test correlation for attenuation, thus providing an 
estimate of the most probable correlation between 
true measures at the two points in time. As an 
index of consistency, it seemed most appropriate 
to utilize the coefficient of determination, i.e., the 
squared values of those coefficients after correc- 
tion for attenuation. The resulting values are 
shown in Fig. 11. 
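
The two-step index described above can be sketched as follows. The correction shown is the usual correction for attenuation (the retest correlation divided by the square root of the product of the two reliabilities); the function names and the input values are hypothetical and are not figures from this study.

```python
import math


def corrected_for_attenuation(retest_r, reliability_time1, reliability_time2):
    """Estimate the correlation between true scores at the two occasions."""
    return retest_r / math.sqrt(reliability_time1 * reliability_time2)


def consistency_index(retest_r, reliability_time1, reliability_time2):
    """Coefficient of determination of the attenuation-corrected correlation."""
    true_r = corrected_for_attenuation(
        retest_r, reliability_time1, reliability_time2
    )
    return true_r ** 2


# Hypothetical inputs: median retest r of .50, reliability of .80 on both occasions.
example_index = consistency_index(0.50, 0.80, 0.80)
```

With these illustrative inputs the corrected correlation is .625 and the index is about .39, which can be read as the proportion of true-score variance that remained stable over the interval.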

It will be noted at once that the five domains 
of variables fall into three groups. Values and 
vocational interests are the most stable, each with 
an index of approximately .50. Self ratings and 
the other personality variables are also about 
equally consistent but with indices about .30. The 
lowest consistency appears for attitudes, the index 
being less than .10. While it is essential that any 
generalizations from these findings be limited to 
measured variables of the kind here sampled, it is 
my best guess that this figure fairly accurately 
summarizes the degree of relative consistency that 
characterizes the several domains of personality 
variables. 

In view of the considerable evidence for the 
general constancy of IQ during the developmental
period, and as reported by Owens and by
Bayley and Oden for adult groups, it is likely 
that intelligence would have appeared at the top 
of this chart, had retest scores been available. 
Next in order among the personality variables we 
find values and vocational interests. Apparently 
these scores are indicative of relatively deeply 
ingrained motivational patterns that do not change 
greatly during the period of middle age. Less 
stable over this long period of time, but as much 
so as scores based on many test items, are self 
ratings on specific personality variables. The rela- 
tive inconstancy of attitudes during the period of 
adulthood came as something of a surprise. While 
it is possible that this relatively low index of con- 
stancy is a function of the particular and limited 
set of attitudes sampled or of the attitude scales 
utilized, I am inclined to believe that further re- 
search will indicate attitudes to be generally less 
stable than any other group of personality vari- 
ables. The relative changeability of attitudes is 
probably a function of their specificity and the 
fact that alternative attitude objects can easily be 
substituted one for the other in the service of 
maintaining an individual's system of values. 
Thus a person with high social values as measured 
by the Allport-Vernon scale might shift his atti- 
tudes toward and even his allegiance from one to 
another of several alternative institutions or organ- 
izations, each dedicated to the service of humanity. 

Although we have thus far been emphasizing
the relative consistencies of personality variables,
I would call your attention to the fact that Fig. 10 
has a “ground” as well as a “figure.” Note the 
relatively wide open spaces to the right of each 
bar. In effect, these are the relative proportions 
of variance which may be expected to change dur- 
ing the period of life with which we are here 
concerned. I venture to say that the potentiality, 
yes, even the probability of this amount of change 
during adulthood is considerably greater than 
would be assumed from any of the current the- 
ories of personality. Similarly I suspect that these 
changes are larger than would be expected by most 
laymen. 

I find it intriguing to speculate as to whether or
not these changes in personality variables in adult- 
hood are sufficiently systematic to be predictable 
for individuals. Conceivably they result from the 
interaction of so many varied forces in the lives 
of individuals that prediction of specific changes 
for individuals may not be possible. To the de- 
gree that psychology can develop techniques for 
predicting the magnitude and direction of change 
in individual personalities, we should become more 
effective in the long-term prediction of vocational, 
marital, and emotional adjustment. 

In order to facilitate the analysis of changes in 
scores, a standard score representing the difference 
between Time I and Time II scores was computed 
for each variable, for each of the subjects. In 
computing these standard scores we utilized the
means and standard deviations of the original dis- 
tribution of scores for each sex. Finally, in order 
to facilitate computation, these standard scores 
were transformed into stanine scores. For each 
individual, then, we had in addition to the original 
and retest scores, a third set of 38 scores indica- 
tive of the direction and magnitude of change 
over the 20-year period. 
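
One way to read this procedure is sketched below. It is an interpretation, not the documented computation: both scores are standardized on the Time I mean and standard deviation, their difference is taken, and the difference is mapped to a stanine with mean 5 and a width of half a standard deviation per band. All names and example numbers are hypothetical.

```python
import math


def standard_score(score, mean, sd):
    """Express a raw score in standard-deviation units."""
    return (score - mean) / sd


def change_standard_score(time1_score, time2_score, time1_mean, time1_sd):
    """Time II minus Time I, both standardized on the Time I distribution."""
    return (standard_score(time2_score, time1_mean, time1_sd)
            - standard_score(time1_score, time1_mean, time1_sd))


def to_stanine(z):
    """Map a standard score to a 1-9 stanine (mean 5, half an SD per band)."""
    band = math.floor(2 * z + 5 + 0.5)  # round half up to the nearest band
    return max(1, min(9, band))


# Hypothetical subject: raw score 40 at Time I, 46 at Time II,
# on a variable with Time I mean 50 and standard deviation 10.
change_z = change_standard_score(40, 46, 50, 10)
change_stanine = to_stanine(change_z)
```

Under this reading a stanine of 5 marks no change, values above 5 mark an increase, and values below 5 a decrease.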

Thus far, our studies of change have proceeded 
along the following lines: 

1. An analysis of the relation of change scores
to original status scores; 

2. An analysis of the degree to which change 
on specific variables is related to changes on other 
variables; 

3. An analysis of the earlier personality corre- 
lates of change scores for a single variable, Inter- 
est Maturity as measured by the Strong Blank; 

4. An analysis of the relation of changes in 
paired individuals presumably subject to similar 
environmental influences.

Time does not permit reporting these studies
in detail, but I shall summarize briefly the pro- 
cedures and findings for each of them. 


THE RELATION OF CHANGE TO ORIGINAL STATUS 


The first question to which we addressed our- 
selves was: How are these change scores related 
to original status scores? For each of the meas- 
ures, correlations were computed between Time I 
scores and the corresponding change scores. 
Since the change scores are indicative of the direc- 
tion as well as the magnitude of the change, with 
a mean change score indicative of no change, it 
was anticipated that statistical regression alone 
would result in negative values of the status- 
change correlations. Expressed in less formidable 
language, subjects who, because of errors of meas- 
urement, originally receive scores higher than 
their true scores tend on a retest to receive lower 
scores; similarly subjects scoring lower than their 
true scores are likely to score higher on a retest. 
For each measure, therefore, it was necessary to 
estimate, on the basis of the known reliability of 
the score, the probable value of the status-change 
correlation that would result from statistical re- 
gression alone. Obviously, the lower the relia- 
bility of the score, the greater will be the correla- 
tion between original and change score; for a test 
with .00 reliability, the status-change correlation 
due to statistical regression alone would be -.707.
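
The value quoted for a test of zero reliability follows from a short derivation, sketched here under simplifying assumptions: the Time I and Time II scores, written $X_1$ and $X_2$, have the same variance $\sigma^2$, and in the absence of true change their correlation equals the reliability $r_{xx}$. The symbols are introduced for illustration only.

```latex
\begin{aligned}
\operatorname{Cov}(X_1,\, X_2 - X_1) &= r_{xx}\,\sigma^2 - \sigma^2 = -(1 - r_{xx})\,\sigma^2, \\
\operatorname{Var}(X_2 - X_1) &= 2\sigma^2\,(1 - r_{xx}), \\
r_{X_1,\,X_2 - X_1} &= \frac{-(1 - r_{xx})\,\sigma^2}{\sigma \cdot \sigma\sqrt{2\,(1 - r_{xx})}}
  = -\sqrt{\frac{1 - r_{xx}}{2}} .
\end{aligned}
```

With $r_{xx} = .00$ this gives $-1/\sqrt{2} \approx -.707$, and the expected correlation moves toward zero as the reliability approaches 1.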

The resulting distributions of obtained and esti- 
mated status-change correlations for the male sub- 
jects are shown in Table 2. These estimates 
assume no change in variance from Time I to 
Time II scores. As will be noted, the obtained 
values tend to be considerably larger than the 
expected, the medians being -.54 and -.30. Not
shown in the table, but even more pertinent from 
the standpoint of statistical significance, is the 
fact that for each of the 37 variables,³ the obtained
value was larger than the expected. It appears, 
therefore, that we are confronted with a general 
phenomenon which might be called “maturational 
regression,” a tendency for the retest scores of 
extreme scoring subjects to regress toward the 
mean of the group.

³ The value of the Time I change correlation for
one variable was inadvertently lost in the electronic
computer! Because the findings were so consistent,
it was not regarded as necessary to compute it separately.

TABLE 2

DISTRIBUTION OF CORRELATIONS BETWEEN
ORIGINAL AND CHANGE SCORES FOR
37 VARIABLES

(N = 176 Males)

r (Minus Values) | Obtained | Expected on Basis of Statistical Regression Assuming No Change in Variance of Scores
80-89 | / |
70-79 | // |
60-69 | ///// // |

This phenomenon of maturational regression
appears to account for as much as half the variance
of change for some variables and for as
little as 5% of the change variance in other variables.
It is most dramatically illustrated for the
variable “Attitude toward Marriage” which it will
be recalled was one of the variables showing a
relatively low test-retest correlation over 20 years. 
Assuming a reliability of .71 for this measure and 
no reduction in score variance, the status-change 
correlation that might be expected on the basis of 
statistical regression alone is —.38; the actual ob- 
tained correlation between status and change is 
—.84 for the men and —.68 for the women. 
Expressed in nonstatistical terms, our subjects who 
tended to have extreme attitudes toward marriage 
at Time I were most likely on the retest to have 
moved to a much more moderate position on this 
continuum. 
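
The expected values quoted in this section can be reproduced with a small sketch. It assumes equal score variances at both occasions and uses the relation expected r = -sqrt((1 - reliability) / 2); the function name is hypothetical, while the reliabilities (.00, .71 and .93) are the ones given in the text.

```python
import math


def expected_status_change_r(reliability):
    """Status-change correlation expected from statistical regression alone.

    Assumes equal variances at Time I and Time II and no true change.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return -math.sqrt((1.0 - reliability) / 2.0)


expected_zero_reliability = expected_status_change_r(0.00)
expected_marriage = expected_status_change_r(0.71)
expected_interest_maturity = expected_status_change_r(0.93)

# Obtained values for men, as reported in the text, for comparison.
obtained_marriage_men = -0.84
excess_marriage_men = obtained_marriage_men - expected_marriage
```

Rounded to two places, the expected values for reliabilities of .71 and .93 come out at -.38 and -.19. The gap between the obtained and the expected correlation is the part the text attributes to maturational regression rather than to errors of measurement.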

It is my best guess that these regressive changes 
of extreme scorers are a function of a variety of 
social forces operative on the individual. If a 
person finds himself too deviant from his group 
on a variable subject to change, he apparently 
finds it easier to shift toward the norm than away 
from it. Obviously, this statement does not hold 
for all individuals; strong ego involvement in one’s 
position on the continuum might well lead to “no 
change” or even to a change in the direction of 
still greater deviance.

On first thought, it would appear that regression
would necessarily result in reducing the variance
of retest scores as compared with original
scores. This may occur, but not necessarily. 
Consider the case of successive administrations of 
a test of .00 reliability. As noted above, the 
resulting status-change correlation would be -.707,
yet the variance of the two distributions of scores 
would be essentially the same. Consider also the 
case of filial regression: tall fathers tend as a rule 
to have sons shorter than themselves, and short 
fathers sons taller than themselves, yet the means 
and standard deviation of fathers’ heights and 
sons’ heights tend to be quite comparable. 
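
The zero-reliability case can be checked with a small simulation. In this sketch two unrelated sets of scores stand in for successive administrations of a completely unreliable test; the sample size, seed and helper names are arbitrary choices made for illustration.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def variance(values):
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def pearson_r(xs, ys):
    """Product-moment correlation between two equally long lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


rng = random.Random(0)
n_subjects = 20000
# Two independent draws: "test" and "retest" share no true score at all.
time1 = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
time2 = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
change = [b - a for a, b in zip(time1, time2)]

status_change_r = pearson_r(time1, change)
variance_ratio = variance(time2) / variance(time1)
```

The status-change correlation comes out close to -.707 while the ratio of the two variances stays close to 1, which is the point of the paragraph: regression toward the mean does not by itself shrink the retest variance.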

For our data, the fact is that for more than half 
of the 38 variables studied, the Time II score 
variances were somewhat smaller than those for 
Time I. These differences were large enough to 
achieve statistical significance, however, for less 
than a fifth of the variables. The most significant 
reduction in variance occurred in “Attitude toward 
Marriage,” “Attitude toward the Church” and 
self ratings on “modesty.” These were variables 
for which the status-change correlations were also 
high—all above .70. 

That a highly significant amount of maturational 
regression may occur without a corresponding de- 
crease in the variance of Time II scores is shown 
for the variable Interest Maturity. Because of the 
high reliability of this variable (.93) the status- 
change correlation expected from statistical re- 
gression is only —.19; the actual values are —.49 
for the men and —.53 for the women, yet the 
Time I and Time II variances are almost equal in 
size. That persons scoring low on Interest Matu- 
rity at the age of 25 might be expected to increase 
their scores 20 years later is hardly surprising. 
Why persons originally scoring high on this vari- 
able should regress and at the age of 45 score 
more like 15-year-olds than they did at the age 
of 25 is an intriguing matter to which we will
return later.


INTERRELATIONSHIPS OF CHANGE SCORES 


Our next attempt to explore the phenomenon 
of personality change started with the question: 
if an individual changes on one personality vari- 
able, is this change a relatively specific one or 
is it likely to be accompanied by changes on one 
or more other variables? For each sex group we 
computed the 38 × 38 matrix of intercorrelations;
note, however, in this case we were dealing not
with the usual set of intercorrelations of test
scores, but with the intercorrelations of the differences
between scores at Time I and Time II.

For obvious reasons, I shall not ask you to look 
at the resulting matrices. I do, however, wish to 
call your attention to certain of their features. 
First of all, as has been generally true for the 
previously reported analyses, the values tended to 
be very similar for the men and women subjects. 
Secondly, the values of the intercorrelations 
tended to be low: of the 703 intercorrelations in 
each matrix, less than 20% were significant at 
the 1% level. Thirdly, changes on approximately 
half of the variables were found to be unrelated 
to changes on any of the other 37 variables. All 
of these facts point to the conclusion that person- 
ality changes as reflected in these difference scores 
tend to be relatively specific. 

As examples of this specificity, I shall present
two small segments of the total correlational
matrix for the men. Table 3 presents the intercorrelations
of changes in self ratings on the 10
personality variables. Of these 45 correlations
only 5 are significant at the 1% level, and these
are relatively small in magnitude. Even though
these 10 variables were selected from the original
36 as relatively uncorrelated, I had fully expected
evidence for one common factor in this little
matrix, a factor reflecting a shift over the 20
years in the general level of self esteem. If such
a factor is operative its contribution to variance
of changes in self ratings is extremely small.

TABLE 3

INTERCORRELATIONS OF CHANGE IN SELF RATINGS
(N = 176 Males; decimal points omitted)

Variables             |   1 |   2 |   3 |   4 |   5 |   6 |   7 |   8 |   9 |  10
 1 Pep                |  -- |  09 |  14 |  09 |  18 |  02 | -11 |  09 |  00 |  18
 2 Intelligence       |     |  -- |  19 |  05 |  11 |  07 |  04 |  02 |  06 |  18
 3 Voice              |     |     |  -- |  28 |  10 | -04 | -03 |  19 |  03 |  12
 4 Dress              |     |     |     |  -- |  14 |  13 | -03 |  17 |  01 |  29
 5 Interests          |     |     |     |     |  -- | -04 | -06 |  07 |  19 |  18
 6 Conventionality    |     |     |     |     |     |  -- |  13 |  05 |  09 |  20
 7 Boisterous-quiet   |     |     |     |     |     |     |  -- |  14 |  21 |  13
 8 Temper             |     |     |     |     |     |     |     |  -- |  11 |  21
 9 Modesty            |     |     |     |     |     |     |     |     |  -- |  16
10 Dependability      |     |     |     |     |     |     |     |     |     |  --
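
The counts quoted for these matrices can be checked with a short sketch. The number of distinct intercorrelations among k variables is k(k - 1)/2, and the smallest correlation significant at the 1% level for N = 176 is approximated here with the Fisher z transformation. The two-tailed test and the approximation are assumptions, since the text does not state how significance was determined, and the list of values is transcribed from Table 3 as best it can be read.

```python
import math
from statistics import NormalDist


def n_pairs(k):
    """Number of distinct intercorrelations among k variables."""
    return k * (k - 1) // 2


def critical_r(n_subjects, alpha=0.01):
    """Approximate two-tailed critical |r| using the Fisher z transformation."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return math.tanh(z_crit / math.sqrt(n_subjects - 3))


# Table 3 entries in hundredths, row by row above the diagonal.
TABLE_3 = [
    9, 14, 9, 18, 2, -11, 9, 0, 18,
    19, 5, 11, 7, 4, 2, 6, 18,
    28, 10, -4, -3, 19, 3, 12,
    14, 13, -3, 17, 1, 29,
    -4, -6, 7, 19, 18,
    13, 5, 9, 20,
    14, 21, 13,
    11, 21,
    16,
]

threshold = critical_r(176)
n_significant = sum(1 for value in TABLE_3 if abs(value) / 100 >= threshold)
```

On these assumptions the threshold falls between .19 and .20, so the entries of .20 and above are the ones counted as significant, in agreement with the five reported. The same pair count gives 703 intercorrelations for the full set of 38 variables.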

As a second example, we now turn to that section
of the matrix showing the intercorrelations
of changes in attitude scores. It will be recalled
that these scores had a relatively low index of
consistency over the 20-year interval, hence the
changes on them might well covary. The facts
are shown in Table 4. Again the intercorrelations
are generally low, only 3 of the 15 reaching a
value of .20.

TABLE 4

INTERCORRELATIONS OF CHANGE ON
SIX ATTITUDE SCORES

(N = 176 Males)

Variables             |   1 |   2 |   3 |   4 |   5 |   6
1 Marriage            |  -- |  24 |  07 |  01 | -06 |  08
2 Church              |     |  -- |  10 |  07 |  10 |  13
3 Rearing children    |     |     |  -- |  20 |  21 | [?]
4 Housekeeping        |     |     |     |  -- | [?] | [?]
5 Entertaining        |     |     |     |     |  -- | -01
6 Gardening           |     |     |     |     |     |  --

In view of the generally low correlations among 
the change scores, our original plan of factoring 
the entire matrix was not carried out. Further 
inspection showed significant correlations of 
change scores among the six value scores; these 
were anticipated because of the manner in which 
the scores are derived, i.e., one can increase his 
score on a single scale only by decreasing it on 
one or more of the other five. Even under these 
circumstances the highest intercorrelation (for the
males) was —.39 indicating a tendency for Eco- 
nomic and Aesthetic value scores to change in 
opposite directions. 
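
The constraint just described can be made explicit. As a sketch, suppose the $k = 6$ value scores sum to a fixed total, so that the change scores $d_1, \ldots, d_k$ sum to zero, and suppose further (for illustration only) that the change scores have equal variances $\sigma^2$ and a common intercorrelation $\bar{r}$:

```latex
0 = \operatorname{Var}\Bigl(\sum_{i=1}^{k} d_i\Bigr)
  = k\,\sigma^2 + k(k-1)\,\bar{r}\,\sigma^2
\quad\Longrightarrow\quad
\bar{r} = -\frac{1}{k-1},
\qquad k = 6:\ \bar{r} = -.20 .
```

On these assumptions the method of scoring alone would yield an average intercorrelation of about -.20 among the six change scores, which gives a baseline against which the highest obtained value can be judged.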

Similarly, in view of the fact that many items
on the Strong Blank contribute to several scores
derived from it, significant intercorrelations of
change scores were expected and found. In general,
these were of the same sign and magnitude
as the correlations reported by Strong among the
scales. For example, score changes indicating a
subject's interests becoming more like those of a
Personnel Manager correlated +.73 with changes
in the direction of higher Interest Maturity. The
correlation reported by Strong between these two
scales is +.75.


Inspection of the intercorrelations between
changes in Allport-Vernon and Strong scores
showed more than a chance number of significant
relationships but not as many as might have been
expected on the basis of the common factors 
shown to underlie these two sets of measures in 
studies by Ferguson, Humphreys, and Strong (5) 
and by Duffy and Crissy (4). 

Had relatively high intercorrelations been found 
among these change scores, we would have at- 
tempted to identify the one or few common fac- 
tors and their early personality correlates. How- 
ever, the relatively marked specificity of these 
change scores suggests the fallacy of current at- 
tempts to posit and assess a global trait of person- 
ality rigidity. Our findings are in line with those 
of a number of recent studies reporting generally 
low and insignificant correlations among so-called 
measures of rigidity (7, 3). 


CORRELATES OF CHANGES IN INTEREST MATURITY 


Had the interrelationships among the change 
scores pointed to the existence of one or more 
general factors, we had planned to describe the 
sorts of people who do and do not tend to show
marked personality changes in adulthood. In 
view of the lack of evidence for any general fac- 
tor of change, we decided to carry out a more 
limited study of the earlier background correlates 
of one set of change scores, those for the Strong 
variable, Interest Maturity. It will be recalled 
that this variable was one for which there were 
no sex differences in original scores and no sig- 
nificant change in means or variances over the 
years for either sex. Furthermore it was a vari- 
able for which Time II scores showed consider- 
ably more regression toward the mean than was 
expected from statistical regression alone. These 
facts posed the interesting question: what kinds 
of people tend between the ages of 25 and 45 to 
change their scores on the continuum, reflecting, 
on the one end, the modal interests of 15-year-old 
boys, and on the other, the modal interests of 25- 
year-old men? 

In this analysis, carried out for the men only, 
Interest Maturity change scores were correlated 
with 66 measures obtained at Time I. Included 
were Time I scores on the 38 variables treated 
throughout this report, age, height, Otis IQ, education,
church membership, and similar background
variables.

Here again the results can be summarized very 
briefly: only 8 of the 66 correlations were significant
at the 1% level. Six of these 8 were negative
correlations with Time I Strong scores: Interest
Maturity and occupational scores for Personnel
Manager, Mathematics-Physical Science
Teacher, Social Science Teacher, Minister, and 
Senior CPA. In general, those men who showed 
early interests similar to men in the above profes- 
sions were more likely to score more like 15-year- 
olds at age 45 than they did at age 25. Con- 
versely, the less our subjects were like men in 
these professions at Time I, the more they tended 
to score higher on the Interest Maturity Scale at 
Time II. 

There were two other significant correlates of 
Interest Maturity change scores: age at the time 
of the first test, and attitude toward rearing chil- 
dren. Increase in Interest Maturity scores tended 
to go with younger age and with more favorable 
attitudes toward rearing children at time of first 
testing. 

Taken together, these findings suggest the pos- 
sibility that for some men, there occurs an early 
(and perhaps premature) development of voca- 
tional interests characteristic of professional per- 
sons who work with and try to help people; this 
may lead to later disillusionment and a tendency 
to develop interest patterns more characteristic of 
persons who prefer to work with ideas and things 
rather than directly with other human beings. 
Changes in the direction of lowered maturity of 
interests were found to be significantly associated 
with interest changes toward those of an architect, 
a mathematician, and a president of a manufac- 
turing concern. Lest it be assumed that I am 
making a value judgment regarding such changes 
of interest, let me remind you that members of 
these three professions may do fully as much for 
their fellow men as members of professions who 
prefer to help people through interpersonal relationships.
Furthermore, the fact that change
scores on Interest Maturity do not correlate sig- 
nificantly with changes in the Allport-Vernon 
value scores suggests that one may change his 
vocational interests without necessarily shifting his 
basic system of values. 


RELATIONSHIP OF CHANGES IN PAIRED INDIVIDUALS 
PRESUMABLY SUBJECT TO COMMON SOCIAL FORCES 


In our last exploration of changes in the 38 
personality variables, we capitalized on the fact 
that 116 of the male subjects and an equal num- 
ber of the women had been members of a close 
diadic group through the time span. As husband 
and wife, each subject had presumably filled a 
relatively prominent role in the social environ- 
ment of another. Was there any systematic re- 
lationship between the changes in one and the 
original scores of the other member of the pair? 

To answer this question, correlations were com- 
puted between each set of change scores and the 
original scores of the spouse. The results can be 
summarized very briefly: for both cross-spouse 
comparisons, the correlations were all relatively 
low indicating but little tendency either for the 
husband to change toward the original score of 
his wife or the wife to change toward that of her 
husband. In fact, although the magnitude of 
most of the correlations was not large enough for 
them to be individually significant, nearly three 
out of four were negative, indicating a slight trend 
for changes for both the husband and wife to be 
away from the original score of the other. Since 
the directions of these relationships tended to be 
similar to those between status and change for 
men and women separately, it seemed likely that 
the change scores of husbands and wives, presum- 
ably subject to many of the same social forces, 
would be positively related. While these correla- 
tions were found to be generally positive, they 
were also small, achieving statistical significance 
for only 4 of the 38 variables: economic, social 
and religious values, and attitude toward marriage. 
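
The cross-spouse comparison just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the computation used in the study: the function names (pearson, cross_spouse_correlations) and the argument layout are hypothetical, and each list is assumed to hold one variable's scores for the same couples in the same order.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def cross_spouse_correlations(h_time1, h_time2, w_time1, w_time2):
    """Correlate change scores (Time II minus Time I) with spouse scores.

    All four arguments are hypothetical inputs: one variable's scores for
    husbands (h) and wives (w), paired by couple.
    """
    h_change = [t2 - t1 for t1, t2 in zip(h_time1, h_time2)]
    w_change = [t2 - t1 for t1, t2 in zip(w_time1, w_time2)]
    return {
        # Change in one spouse against the original score of the other.
        "husband_change_vs_wife_original": pearson(h_change, w_time1),
        "wife_change_vs_husband_original": pearson(w_change, h_time1),
        # Change in one spouse against change in the other.
        "husband_change_vs_wife_change": pearson(h_change, w_change),
    }
```

Following the interpretation given in the text, a negative value for either of the first two coefficients would indicate change away from the spouse's original score, and a positive value for the third would indicate that the two spouses tended to change in the same direction.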

It is commonly believed that persons married 
to each other tend with the passing years to be- 
come more and more similar; in fact, I have even 
heard it said that this principle holds for physical 
appearance. Obviously, our data provided for a 
direct test of this hypothesis. For the 116 cou- 
ples, husband-wife correlations were computed 
between Time I scores for each of the 38 variables 
and again at Time II. In line with our prelimi- 
nary report on assortative mating for the 300 
engaged couples (8) of which these 116 constitute 
a subsample, the Time I correlations were found 
to be positive for practically all variables, ranging 
from —.02 to .58. In other words, we found no 
evidence 4 to support the opinion that “opposites 
attract.” 

4 Note, however, that our personality variables did 
not include any measures of “needs” which Winch, 
Ktsanes, and Ktsanes (20) believe to be negatively 
correlated in assortative mating. 

What about the Time II correlations? In gen- 
eral, they proved to be no different than those at 
the time of original testing! Actually they were 
slightly smaller for 21 of the 38 variables and 
the few statistically significant shifts were in the 
direction of the couples becoming less similar 
with the elapse of 20 years. However, since some 
of the Time II correlations were attenuated be- 
cause of slightly reduced variances of measures 
on the retest, the most conservative generalization 
seems to be that the initial similarity between 
husbands and wives becomes neither greater nor 
less with the passing years. Apparently the initial 
similarity is adequate for most husbands and wives 
to establish and maintain a cohesive relationship 
without the need to become more alike. And 
while we can readily think of many forces tend- 
ing to promote increasing congruence between 
mates, we must not overlook the apparently equal 
impact of centrifugal forces associated with main- 
taining the many kinds of role differentiation ex- 
pected of husbands and wives in our culture. 

This completes the report of our explorations 
of personality consistency and change in adult- 
hood. The sample of variables studied was nec- 
essarily limited to the techniques available 20 
years ago, but the results for the several variables 
are so consistent that we may accept them as 
pointing to generalizations that are likely to be 
confirmed in later research. 

With respect to personality consistency, our 
results can and probably will be used to support 
very different theoretical positions. Absolute 
changes in personality scores tended to be small 
but similar in direction and magnitude for men 
and women. We found evidence for considerable 
consistency of several variables, in spite of fallible 
tools and a time span of nearly 20 years. But we 
also found evidence for considerable change in 
all variables measured. These changes were 
shown to be relatively specific rather than reflect- 
ing any over-all tendency to change. While meas- 
urable changes occurred on most variables, it 
appears that correlates of these changes are many 
and elusive, and hence changes in scores are likely 
to be difficult to predict for individuals. Finally, 
we found that the measurable changes showed 
little or no relation to known forces assumed to 
be dominant in an individual’s immediate social 
environment; this finding points to the probable 
difficulty of obtaining firm knowledge concerning 
the mechanisms of effecting change. 

The intensive study of any aspect of growth 
and development cannot but serve to increase 
one’s respect for the integrative capacities of the 
human organism. Beginning with the complex 
structures and functions provided by its unique 
genetic constitution, each organism, while main- 
taining its organic integrity and a considerable 
residue of its original nature, moves through its 
maturational cycle adapting to and permitting 
itself to be modified by selected aspects of its 
immediate environment. These adaptive changes, 
occurring most rapidly in the years of infancy 
and childhood, are so appropriately timed that 
they do not threaten the organism either physio- 
logically or psychologically. Our findings indicate 
that significant changes in the human personality 
may continue to occur during the years of adult- 
hood. Such changes, while neither so large nor 
sudden as to threaten the continuity of the self 
percept or impair one’s day-to-day interpersonal 
relations, are potentially of sufficient magnitude to 
offer a basis of fact for those who dare to hope 
for continued psychological growth during the 
adult years. 
SECTION V 


Social, Cultural, and Personality Measures 


The fields of personality and social psychology 
have, over the years, shown considerable overlap 
with respect to theories and problems studied and 
methods employed. This state of affairs would 
seem to be all to the good in view of the mutual 
influence of personal and environmental factors 
on each other. Just as the study of human 
development contributes to a better understand- 
ing of present personality functioning, so also 
does the study of group and environmental 
forces contribute to an understanding of the mold- 
ing of personality. The articles in the present 
section reflect the close tie that exists between 
personality characteristics and the social environ- 
ment. 

Newcomb’s article, which begins this section, 
is important in at least two respects. In the first 
place, it offers an interesting formulation con- 
cerning the joint influences of individuals’ per- 
sonality characteristics and the social context on 
interpersonal attraction. In the second place, it 
describes an experimental methodology by means 
of which the formulation can be empirically 
studied. The evidence presented in this paper 
suggests that the degree to which an individual's 
attitudes toward himself are confirmed by other 
individuals in the social environment is a signifi- 
cant factor in the prediction of interpersonal at- 
traction. Newcomb’s work is particularly inter- 
esting because it represents an effort to account 
for interpersonal phenomena through an amalgam 
of concepts from the fields of learning, social 
psychology, personality, and perception. 

Whereas Newcomb’s concern can be viewed as 
individuals’ approach responses to other individ- 
uals, McArthur, Waldron, and Dickinson's inter- 
est centers on individuals’ approach responses to 
a particular class of objects. In their article, they 
present an analysis of the personality and social 
factors which may influence the acquisition of 
the smoking habit. Their empirical findings in- 
dicate that the tendency toward such responses 
as heavy smoking, light smoking, and cessation 
of smoking is significantly influenced by per- 
sonality and social variables. 

One social variable which has received consid- 
erable attention both theoretically and empirically 
is that of group belongingness. In the research 
of Lewit we see an effort at examining one aspect 
of minority group belongingness, the manner in 
which members of a minority group orient them- 
selves with respect to a more dominant social 
group. Using two groups of Jews, differing in 
their attitudes and reported behavior with respect 
to other Jews, Lewit sought to determine whether 
or not these groups would differ in social prefer- 
ences from each other and from a non-Jewish 
sample. 

Whereas the first three articles in this section 
deal with the influence of social variables on be- 
havior in the United States, Beier and Hanfmann 
set as their task the personality description of per- 
sons, former Soviet citizens, whose cultural back- 
grounds are quite different from ours. By means 
of interviews and the use of projective questions 
they then compare these descriptions with com- 
parable descriptions obtained for a group of 
Americans. By means of analyses of the content 
of the verbalizations of their Soviet and American 
subjects, it was possible for Beier and Hanfmann 
to draw a number of inferences concerning per- 
sonality and emotional differences between the two 
groups. 

The studies presented in this section attest to 
the interest of many researchers in comparing 
groups differing on socially relevant dimensions. 
Many social psychologists are also concerned 
with the study of behavior in experimentally cre- 
ated group situations. Is an individual’s behavior 
influenced by the presence or absence of a group? 
Crutchfield’s findings reported in the final paper 
of this section present convincing support for the 
hypothesis that the immediate experimentally cre- 
ated social environment may exert an important 
impact on an individual's judgments and perform- 
ance. It seems reasonable to conclude from 
Crutchfield’s work that a fruitful avenue of future 
research on behavior in groups will be the analysis 
of those individual differences in personality which 
affect an individual's responses to group pressures. 
THE PREDICTION OF 
INTERPERSONAL ATTRACTION * 


THEODORE M. NEWCOMB 1 


During the past 30 years, according to my 
estimate, 9,426 articles and books, plus or minus 
2,712, have been published in English on the 
topic of “attitudes.” A large proportion of them 
deal with attitudes toward people—most com- 
monly toward family members, toward categories 
like ethnic, religious, or occupational groups, or 
toward prominent individuals like Franklin D. 
Roosevelt or Adolf Hitler. At the level of psy- 
chological generalization, such studies have prob- 
ably taught us more about the organization of 
individual personality, and about group influences 
upon individual motivation and cognition, than 
about the nature of person-to-person relation- 
ships. At any rate it seems appropriate to pose 
the question whether persons, as objects of atti- 
tudes, have properties that distinguish them from 
other classes of objects. If so, it is possible that 
the determinants of attitudes toward persons are 
in some respects different from those of other 
attitudes. Since it is convenient to have a distinc- 
tive label for something that one wishes to keep 
distinct, I shall use the term “attraction” to refer 
to attitudes toward persons as a class of objects. 

Today I shall be primarily (though not exclu- 
sively) concerned with the motivational-affective 
aspects of attraction. Though I shall be referring 
mostly to its simpler manifestations—like choos- 
ing to spend time with a person, or expressing a 
generally favorable attitude toward him—I want 
to note, in passing, that there are several dimen- 
sions of attraction which are operationally dis- 
tinctive and (to me, at any rate) conceptually 
necessary. Though I shall not stop to label these 
dimensions, what they all have in common is 
degree and direction on an approach-avoidance 
continuum, together with associated cognitive 
content. 

* Reprinted by permission from The American Psy- 
chologist, November, 1956, Vol. 11, No. 11, 575-586. 

1 Address of the President at the Sixty-Fourth An- 
nual Convention of the American Psychological As- 
sociation, Chicago, Illinois, September 2, 1956. 

I think it not much of an exaggeration to say 
that there exists no very adequate theory of interpersonal 
attraction. It has often seemed to 
me that even we psychologists, who like to pride 
ourselves in recognizing that nothing occurs 
apart from its necessary and sufficient conditions, 
have come very close to treating the phenomena 
of personal attraction as an exception to the 
general rule. It is almost as if we, like our lay 
contemporaries, assumed that in this special area 
the psychological wind bloweth where it listeth, 
and that the matter is altogether too ineffable, 
and almost passeth even psychological under- 
standing. 

I hope you will regard this last comment as 
being in part, but not in toto, a rhetorical exag- 
geration. The fact is, of course, that both theo- 
retical and empirical efforts have been devoted to 
the problem. To some of these I now turn. 

Perhaps the simplest—and, in many ways, still 
the most convincing—of the notions concerning 
determinants of positive attraction is that of 
propinquity. In its baldest form, the proposition 
of propinquity reads as follows: other things 
equal, people are most likely to be attracted 
toward those in closest contact with them. Every- 
day illustrations readily leap to mind. Adults 
generally have strongest attraction toward those 
children, and children toward those adults, with 
whom they are in most immediate contact— 
which is to say, their own children and their own 
parents. And this commonly occurs, let me re- 
mind you, in spite of the fact that neither parents 
nor children choose each other. Or, if we are 
willing to accept the fact of selection of marriage 
partners as an index of positive attraction, then 
the available data are strongly in support of a 
theory of propinquity. If we use an adequate 
range of distance—miles, or city blocks rather 
than yards, or within-block distances—there is a 
neat, monotonic relationship between residential 
propinquity and probability of marriage, other 
criteria of eligibility being held constant (e.g., 
Tez ays 

It is, of course, a truism that distance per se 
will have no consequences for attraction; what 
we are concerned with is something that is made 
possible, or more likely, with decreasing distance. 
I think we may also consider it a truism that that 
something is behavior. Further, it is behavior 
on the part of one person that is observed and 
responded to by another; it is interaction. So 
widespread and so compelling is the evidence 


for the relationship between frequency of inter- 
action and positive attraction that Homans (9) 
has ventured to hypothesize that “If the fre- 
quency of interaction between two or more per- 
sons increases, the degree of their liking for one 
another will increase.” Actuarially speaking, the 
evidence is altogether overwhelming that, ignor- 
ing other variables, the proposition is correct in 
a wide range of situations. 

Why should this be so? Accepting the propo- 
sition only in an actuarial sense, and ignoring 
for the moment the other variables obviously in- 
volved, what theoretical considerations will en- 
able us to make psychological sense out of it? 
The principle which comes first to mind is that 
of reward and reinforcement. Two simple as- 
sumptions will enable us to make direct use of 
this principle: first, that when persons interact, 
the reward-punishment ratio is more often such 
as to be reinforcing than extinguishing; and sec- 
ond, that the on-the-whole rewarding effects of 
interaction are most apt to be obtained from those 
with whom one interacts most frequently. These 
assumptions, together with the principles of re- 
ward and reinforcement and canalization, would 
account for the general association of frequency 
of interaction with positive attraction; they would 
not, of course, account for the many observed 
exceptions to the generalization. 

To return to my earlier illustrations, this set 
of assumptions and principles would not apply 
in exactly the same way to the facts of attraction 
between parents and children and to the facts of 
marital selection. One difference, of course, is 
that selection is possible in the latter but not in 
the former case. As applied to the facts of 
parent-child attraction, the principle of propin- 
quity asserts, in effect, that we are attracted to 
those whom “fate” has made rewarding. As 
applied to the facts of marital selection, the prin- 
ciple of propinquity says little more, in addition 
to this, than that the likelihood of being rewarded 
by interaction varies with opportunity for inter- 
action. The problem of selection, among those 
with whom opportunity for interaction is the 
same, still remains. 

The principle of generalization has often been 
called upon to account for selective attraction 
among those with whom opportunities for inter- 
action are the same. Many Freudians, in par- 
ticular, have assumed that in adolescence or 
adulthood attractions are largely determined by 
personal qualities resembling those of parents or 
siblings, initially determined by the Oedipus con- 
figuration—as illustrated by the old refrain, “I 
want a girl just like the girl that married dear 
old Dad.” This principle, together with its vari- 
ants, obviously cannot be omitted from a com- 
plete theory of interpersonal attraction, but 
neither can it be considered as a major contribu- 
tion to it, since, in itself, it says nothing about 
the initial basis of attraction but only about exten- 
sions from one already attractive person to an- 
other, similar one. Perhaps the chief contribu- 
tion of the principle of generalization lies in the 
enhanced probability that thresholds for inter- 
action with persons resembling those toward 
whom one is already attracted are lower than for 
other persons; if so, then the likelihood of the 
rewards of interaction with such persons is 
greater than for other persons. 

There is an interesting consequence of the 
proposition that attraction toward others varies 
with the frequency of being rewarded by them. 
Opportunities for being rewarded by others vary 
not only with propinquity, as determined by ir- 
relevant considerations like birth and residence, 
but also with the motivations of the potentially 
rewarding persons. This suggests that the likeli- 
hood of being continually rewarded by a given 
person varies with the frequency with which that 
person is in turn rewarded, and thus we have a 
proposition of reciprocal reward: the likelihood 
of receiving rewards from a given person, over 
time, varies with the frequency of rewarding him. 
This proposition is significant for my problem in 
various ways, especially because it forces further 
consideration of the conditions under which con- 
tinued interaction between the same persons is 
most likely, and under which, therefore, the pos- 
sibilities of continued reciprocal reward are 
greatest. 

The first of these may be most simply described 
as the possession by two or more persons of 
common interests, apart from themselves, that 
require interdependent behavior. If you like to 
play piano duets, or tennis, you are apt to be 
rewarded by those who make it possible for you 
to do so, and at the same time you are apt to 
reward your partner. Insofar as both partners 
are rewarded, another evening of duets or an- 
other set of tennis is likely to ensue, together 
with still further opportunities for reciprocal re- 
ward. Thus attraction breeds attraction. 

The second condition favorable to continued 
reciprocal rewards has to do with complementary 
interests (rather than with similar ones) that 
require interdependent behavior. These are sym- 
biotic relationships, like that in which cow and 
cowbird become attracted to each other: the cow 
provides sustenance for the bird in the form of 
parasitic insects, the removal of which is reward- 
ing to both. Or, at the human level, consider 
the exchange of gratifications between a pair of 
lovers. Here, too, under conditions of comple- 
mentary rather than of similar motivations, the 
general rule is that attraction breeds attraction. 

There have also been interesting attempts, of 
late, to test the proposition that symbiotic person- 
ality needs tend to characterize marriage part- 
ners—who, it may be presumed, are reciprocally 
attracted to a greater than average degree. Pro- 
fessor E. L. Kelly’s work, some of which was 
reported on this occasion one year ago (11), has 
quite consistently revealed the existence of sim- 
ilar rather than complementary traits, both among 
spouses twenty-odd years married and among 
engaged couples. It is interesting, however, that 
his findings since last year suggest a curvilinear 
relationship between initial homogeneity and mar- 
riage durability; the best prognosis is provided 
by neither too much nor too little similarity. 
These findings, however, are not conclusive for 
my present problem—first, because there are 
many determinants of marriage durability other 
than personal attraction; and second, because 
comparatively few of the traits that he measured 
were such as could either confirm or disconfirm 
the hypothesis of personality symbiosis. 

This problem has, however, been directly at- 
tacked by Professor Robert Winch, using meas- 
ures derived from Murray’s list of needs. My 
own perusal of his research reports (17, 18) sug- 
gests no conclusive findings for my problem, but 
if his personality ratings are free from contam- 
ination it seems clear that, within his sample of 
25 middle-class couples, traits or needs can be 
found with regard to which spouses are more 
likely to be different than alike—in particular a 
dimension labeled “assertive-receptive.” It is not 
possible, from Winch’s data (nor from any other 
data known to me), to estimate how much of the 
variance in marital selection can be accounted 
for in terms of symbiotic personality needs. But 
it is surely a plausible notion that an individual 
with strong needs for assertiveness is more likely 
to find himself rewarded in this area of his life 
by interaction with a person who is receptive to 
his assertiveness than with one who is not. 

The most detailed of the analyses of socio- 
metric structures, especially those of Jennings 
(10), reveal analogous kinds of personality 
symbiosis; the over-chosen need the under-chosen, 
and vice versa. Many of the phenomena of 
choosing and accepting “leaders” (cf. 7) are also 
understandable from this point of view. 

There is another common notion about inter- 
personal attraction, to the effect that it varies 
with similarity, as such: birds of a feather flock 
together. It is not a very useful notion, how- 
ever, because it is indiscriminate. We have 
neither good reason nor good evidence for be- 
lieving that persons of similar blood types, for 
example, or persons whose surnames have the 
same numbers of letters, are especially attracted 
to one another. The answer to the question, 
Similarity with respect to what?, is enormously 
complex—because similarities of many kinds are 
associated with sheer contiguity, for one thing. 
I shall therefore content myself with the guess 
(for which fairly good evidence exists *) that the 
possession of similar characteristics predisposes 
individuals to be attracted to each other to the 
degree that those characteristics are both ob- 
servable and valued by those who observe them 
—in short, insofar as they provide a basis for 
similarity of attitudes. 

Up to this point I have noted that we acquire 
favorable or unfavorable attitudes toward per- 
sons as we are rewarded or punished by them, and 
that the principles of contiguity, of reciprocal 
reward, and of complementarity have to do with 
the conditions under which rewards are most 
probable. From now on I shall be primarily 
concerned with a special subclass of reciprocal 
rewards—those associated with communicative 
behavior. 

* Such evidence will be presented in my forthcom- 
ing monograph. 

The interaction processes through which re- 
ciprocal reward occurs have to do not with the 
exchange of energy but with the exchange of 
information, and are therefore communicative. 
I prefer the term “communicative behavior” to 
“social interaction” because it calls attention to 
certain consequences that are characteristic of 
information exchange, but not of energy ex- 
change, among symbol-using humans. The use 
of symbols, needless to say, involves the expendi- 
ture of energy, but—even in so obvious an ex- 
ample as that of receiving a slap in the face—it 
is the consequences of the information exchange 
rather than the energy exchange which interest 
us, as psychologists. 

I shall note two of these consequences, in the 
form of very general propositions—though each 
of them is in fact subject to very specific limita- 
tions. The first is this: communicators tend to 
become more similar to each other, at least mo- 
mentarily, in one or more respects, than they 
were before the communication. At the very 
least (assuming more or less accurate receipt of 
a message that has been intentionally sent), both 
sender and receiver now have the information 
that the sender wishes to call the attention of 
the receiver to the object of communication— 
i.e., that which the symbols symbolize. If we 
stipulate still further conditions, the proposition 
will apply to a wider range of similarity. Sup- 
pose, for example, that a person has just expressed 
an opinion about something—say the United 
Nations; to the degree that he is sincere, and 
insofar as the receiver trusts his sincerity, the 
communication (if accurately received) will be 
followed by increased cognitive similarity, to the 
effect that the transmitter holds the stated opinion. 
Now suppose we add a further stipulation—that 
the receiver not only trusts the sender’s sincerity 
but also respects his knowledgeability; under these 
conditions the opinions of sender and receiver 
are likely to be more similar than they were 
before. 

It is this last kind of similarity—i.e., that of 
attitudes—that has a special importance for the 
problem of interpersonal attraction. In fact, the 
proposition, as applied to similarity of attitudes 
toward objects of communication, has already 
introduced, as independent variables, certain di- 
mensions of attraction—namely, trust and re- 
spect. Change toward similarity in one kind of 
attitude following communication, I have as- 
serted, varies with another kind of attitude—i.e., 
attraction. 

My second proposition reverses this relation- 
ship: attraction toward a co-communicator (ac- 
tual or potential) varies with perceived similarity 
of attitudes toward the object of communication. 
Before specifying the limited conditions under 
which this proposition applies, let me briefly pre- 
sent its rationale. 

While there are, of course, many exceptions, 
it is a highly dependable generalization that the 
life history of every human has made accurate 
communication rewarding far more often than 
punishing. Such is our dependence upon one 
another, from the very beginnings of communica- 
tive experience, and such is our indebtedness to 
culture, which is transmitted via communication, 
that success in the enterprise of becoming social- 
ized depends upon success in transmitting and 
receiving messages. Insofar as accurate com- 
munication is in fact rewarding, reward value 
will attach to the co-communicator—which is 
to say that positive attraction toward him will 
increase (other things equal) with frequency of 
accurate communication with him. Please note 
the qualification: “insofar as accurate communi- 
cation is in fact rewarding”; there are many mes- 
sages—e.g., “I hate you”—the accurate receipt 
of which is not in fact rewarding. 

If, as I have maintained, increased similarity 
in some degree and manner is the regular accom- 
paniment of accurate communication, it would 
be no surprise to discover that increased similarity 
becomes a goal of communication, and that its 
achievement is rewarding. And if, as I have also 
maintained, the reward value of successful com- 
munication attaches to the co-communicator, then 
it follows that the two kinds of reward effects— 
perception of increased similarity as rewarding, 
and perception of the co-communicator as re- 
warding—should vary together. This, in brief, is 
the rationale of my second proposition. 

It is, however, a very general statement, and its 
usefulness can be enhanced by a further specifica- 
tion of conditions. I shall mention only two of 
them. First, the discovery of increased similarity 
is rewarding to the degree that the object with 
regard to which there is similarity of attitudes 
is valued (either negatively or positively). The 
discovery of agreement between oneself and a 
new acquaintance regarding some matter of only 
casual interest will probably be less rewarding 
than the discovery of agreement concerning one’s 
own pet prejudices. The reward value of in- 
creased similarity increases, secondly, with the 
common relevance of the attitude object to the 
communicators. The success of a certain presi- 
dential candidate, for example, is likely to be 
seen as having consequences for both, whereas 
matters regarded as belonging in the area of per- 
sonal taste—like taking cream in one’s coffee— 
are viewed as devoid of common consequences. 
The discovery of similarity of the latter kind is 
not very likely to have much reward value. 

The thesis that interpersonal attraction varies 
with perceived similarity in regard to objects of 
importance and of common relevance is, from 
one point of view, opposed to the thesis of com- 
plementarity. In my own view, however, they 
are not in opposition; indeed, I regard the thesis 
of complementarity as a special case of similar- 
ity. Let me illustrate. Suppose, as Winch’s data 
may indicate, that an assertive person is more 
likely to be attracted toward a receptive than 
toward another assertive person, as a marriage 
partner. It is my guess that this would most 
probably occur if they have similar attitudes to 
the effect that one of them should be assertive 
and the other receptive. (Whether or not they 
use these words—and whether, indeed, they are 
able to verbalize the matter at all—does not 
matter.) In short, I am attempting to defend 
the thesis that interpersonal attraction always and 
necessarily varies with perceived similarity re- 
garding important and relevant objects (includ- 
ing the persons themselves). While I regard 
similarity of attitudes as a necessary rather than 
a sufficient condition, I believe that it accounts 
for more of the variance in interpersonal attrac- 
tion than does any other single variable. 

As the foregoing implies, and as I have else- 
where suggested (13), attraction and perceived 
similarity of attitude tend to maintain a constant 
relationship because each of them is sensitive 
to changes in the other. If newly received in- 
formation about another person leads to increased 
or decreased attraction toward him, appropriate 
changes in perceived similarity readily ensue— 
often at the cost of accuracy. And if new in- 
formation—either about the object or about an- 
other person’s attitudes toward it—leads to per- 
ceptions of increased or decreased similarity with 
him, then the direction or the degree of attrac- 
tion toward him easily accommodates itself to 
the situation as newly perceived. Change in at- 
traction is one, but only one, of the devices by 
which some sort of tension state, associated with 
perceived discrepancy about important and rele- 
vant objects, is kept at a minimum. 

At the outset, I raised the question whether 
persons, as objects of attitudes, have properties 
that distinguish them from other objects. I 
ought now to acknowledge that I have already 
assumed that they do. I have been assuming 
that persons, as objects of attitudes, also have 
attitudes of their own—and, in particular, that 
they have (or can have) attitudes toward the 
same objects as do persons who are sources of 
attitudes toward the object-persons. Further, I 
have been assuming that object-persons have the 
same capacities for being disturbed by perceived 
discrepancies as do those who are attracted to- 
ward them. In degree, if not in toto, these are 
distinctively human characteristics, as G. H. Mead 
long ago noted (12), and any theory of inter- 
personal attraction that is at all distinctive from 
a general theory of attitudes must, I believe, pay 
homage to this fact. 

The remainder of this paper is devoted to some 
tests of specific predictions derived from the two 
propositions already presented, which may be 
telescoped as follows: Insofar as communication 
results in the perception of increased similarity 
of attitude toward important and relevant objects, 
it will also be followed by an increase in positive 
attraction. I shall therefore consider perceived 
similarity of attitude as a predictor of attraction. 
I shall also, for obvious reasons, be interested in 
actual, or objective, similarity. 

Since the findings which I shall present were 
obtained in a single research setting, I shall stop 
briefly to note the nature of that setting. I started 
with the research objective of observing the 
changing interrelationships, over time, between 
attraction and similarity of attitudes. Since it 
seemed important to start with a base line of 
zero, as far as attraction was concerned, it was 
necessary to find a population of persons who 
were complete strangers to each other. It also 
seemed desirable to provide a setting in which it 
would be possible for a high degree of positive 
attraction to develop, and in which regular and 
repeated observations could be made. All of 
these requirements seemed to be met by the fol- 
lowing arrangements. A student house was 
rented; male transfer students, all strangers to 
the University of Michigan, were offered the 
opportunity (several weeks before their planned 
arrival at the University) of receiving free room 
rent for a full semester; in return they were to 
spend four or five hours a week in responding to 
questionnaires and interviews, and in participat- 
ing in experiments. Among those who submitted 
applications to live in the house under these con- 
ditions, 17 (the capacity of the house) were 
selected, no two of whom had ever lived in the 
same city, nor attended the same school. All 17 
men arrived within a 24-hour period, and all 
responded to a questionnaire within a very few 
hours thereafter. The men were given no voice 
in the selection of roommates, but (within the 
limits of University regulations) they were given 
complete freedom to conduct the house, includ- 
ing the cooking and eating arrangements, as they 
chose. The entire procedure was repeated, with 
a different but strictly comparable group, one 
year later. So far, however, the data have not 
been very fully analyzed, and unless otherwise 
noted the findings that I shall report are from 
the second year only. 

In this setting, data were obtained by ques- 
tionnaire and interview, at semi-weekly intervals. 
A wide range of attitude responses was obtained, 
as well as rather complete data concerning inter- 
personal attraction. Measures of the latter were 
derived both from responses to direct questions 
about how favorably each house member felt 
toward each of the others, and from reports by 
each about informal, freely associating subgroups 
of two or more. It turned out that there were 
some important differences between these two 
measures of attraction. The “General Liking” 
responses (as we labeled the former) were the 
more amenable to parametric measurement, and 
unless otherwise specified those findings that I 
shall mention here depend upon this measure. 
But the “clique” measure (as we came to call 
it) was probably the more valid index of attrac- 
tion for the purpose of testing many of our 
hypotheses, since it was based upon the reports 
of many observers having constant opportunity 
to notice who spent most time, and therefore had 
most opportunity for communication, with whom. 
The General Liking measure was probably the 
more sensitive toward the negative pole of at- 
traction, since a full sixth of all pairs received 
zero scores on the other measure; but toward the 
positive pole it was often a more valid index of 
“admiration at a distance” than of direct con- 
tact and communication. 

I turn now to some specific predictions. First, 
if the basic generalization is correct, it should 
follow that, regardless of the content of com- 
munication, positive attraction will increase with 
opportunity for communication, other things 
equal. The only additional assumption involved 
in this prediction is that the likelihood of being 
rewarded by a co-communicator increases with 
opportunity for communication. I might add 
that there is nothing new about this prediction; 
it is, in fact, a restatement of our old friend, the 
principle of propinquity. Previous studies—e.g., 
by Festinger, Schachter, and Back (5) and by 
Deutsch and Collins (4)—have provided con- 
vincing support for it. 

Our own data give partial, but not complete, 
support for the prediction. Perhaps the best 
illustration of our findings that I can offer stems 
from an experimental “failure.” During our first 
project year, roommate assignments had, literally, 
been drawn from a hat. In planning for the 
second year, however, we decided to assign room- 
mates by experimental criteria. Half of the 
roommate combinations were therefore assigned 
in such manner as to insure (as we thought) that 
minimal attraction between roommates would re- 
sult, and maximal attraction in the other half of 
the combinations. (Our assignments were based 
upon data provided by mail, some weeks before 
the men arrived.) Our predictions received no 
support whatever; from the very beginning, and 
during each of the succeeding 15 weeks, the mean 
level of attraction between roommates—includ- 
ing those for whom we had predicted low attrac- 
tion—was higher than for all non-roommate pairs. 
It is also worth reporting that, at the beginning 
but not at the end of the semester, mean attrac- 
tion among all pairs living on each of the two 
floors of the house was higher than for all inter- 
floor pairs. During the final week, 90 per cent 
of all inter-roommate choices were in the upper 
three-eighths of all choices. 

These findings, as I have said, were obtained 
during our second year. Now I must report 
that, during the first year, the relationship be- 
tween attraction and room propinquity was noth- 
ing like so close. I shall not stop to give you 
the actual figures, but at the end of the semester 
inter-roommate attraction was only slightly higher 
than that between non-roommates. This incon- 
sistency would be frustrating, indeed, if there 
were no other variables to which the differences 
could be related; after describing these other 
variables, I shall show that they account for much 
of this inconsistency with regard to proximity. 
Meanwhile, the proposition under consideration 
is that proximity, alone, cannot account for at- 
traction, but only to the degree that it facilitates 
the development of perceived similarity of atti- 
tude does it contribute to attraction. 

The remainder of my predictions, unlike the 
first, take into account the content of communi- 
cation. They are of the following general form: 
If and when increased attraction between pairs of 
persons does occur with opportunity for com- 
munication, it will be associated with increased 
similarity of attitude toward important and rele- 
vant objects. 

The first of these predictions is based upon the 
additional assumption that one’s self is a valued 
object to oneself. If so, then attraction should 
vary closely with self-other agreement about one- 
self. More specifically, insofar as a person’s pre- 
sumably ambivalent self-orientations are predom- 
inantly positive, his attraction toward others will 
vary directly with their attraction toward him. 
In testing this proposition, reciprocal attraction 
may be treated either as “objective” (i.e., as 
actually expressed by others toward the indi- 
vidual being considered) or as “perceived” (i.e., 
as that individual estimates that others will ex- 
press attraction toward himself). The latter pre- 
diction, however—that one’s attraction toward 
others varies with their perceived attraction to- 
ward oneself—seems almost untestable except in 
circular fashion; there are few ways in which it 
can be demonstrated, in a “natural” situation, that 
attraction toward others is the dependent variable 
and perceived attraction toward oneself the 
independent variable. 

Whatever the causal direction, our data show 
that an individual’s distribution of General Liking 
among his associates is related to their liking for 
him. The relationship is almost as close on the 
fourth day as at the end of the fourth month, and 
as a general tendency is highly significant, though 
there are individual exceptions. One can predict 
an individual’s liking for another individual with 
much better than chance accuracy if one knows 
the latter’s liking for the former, at any time after 
the fourth day. 

The prediction will be a good deal more ac- 
curate, however, if it is made from an individual’s 
estimate of how well he is liked by the other. 
At any time from the second week on (when such 
estimates were first made), about three of every 
four estimates of another person’s liking for one- 
self were in the same half of the distribution as 
own liking for that other person. Median rank- 
order correlations were .86 at the end, and .75 at 
the beginning, between each man’s liking for each 
other man and his estimate of the reciprocals. As 
might be expected, this relationship was especially 
close at the extremes; 5 out of 6 predictions of 
liking for other persons would be in the correct 
quarter of the distribution, if based only upon 
subjects’ estimates that they are in the highest 
or lowest quarter of reciprocated liking. Such 
findings correspond closely to those previously 
reported by Tagiuri (16). 
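
The median rank-order correlation described above can be sketched as follows. This is only an illustration under stated assumptions: the text does not say which rank-order coefficient was used, so the sketch computes a Spearman-type coefficient (a product-moment correlation on ranks, with ties given their average rank), and the three-man scores are hypothetical toy values, not data from the study. The function and variable names are likewise invented for the example.

```python
from statistics import median


def average_ranks(values):
    """Rank values 1..n in ascending order; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Product-moment correlation; assumes neither list is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def rank_correlation(x, y):
    """Spearman-type coefficient (assumed here; the source names no formula)."""
    return pearson(average_ranks(x), average_ranks(y))


def median_liking_vs_estimate(liking, estimate):
    """liking[i][j]: i's liking for j; estimate[i][j]: i's guess of j's liking for i."""
    rhos = []
    for i in liking:
        others = sorted(liking[i])
        rhos.append(rank_correlation([liking[i][j] for j in others],
                                     [estimate[i][j] for j in others]))
    return median(rhos)


# Hypothetical toy scores for three men (higher = better liked).
liking = {
    "a": {"b": 3, "c": 2, "d": 1},
    "b": {"a": 1, "c": 2, "d": 3},
    "c": {"a": 2, "b": 3, "d": 1},
}
estimate = {
    "a": {"b": 3, "c": 1, "d": 2},
    "b": {"a": 1, "c": 2, "d": 3},
    "c": {"a": 2, "b": 3, "d": 1},
}
print(median_liking_vs_estimate(liking, estimate))
```

Taking the median across men, rather than pooling all pairs, keeps one man's idiosyncratic use of the scale from dominating the summary.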

Apparently the close relationship between Gen- 
eral Liking and its estimated reciprocal is but 
slightly influenced by communication. At any 
rate, the relationship does not increase signifi- 
cantly from near-strangership to close acquaint- 
ance, nor is the relationship significantly closer for 
roommates, at the end of the four-month period, 
than for non-roommates. Neither, as a matter of 
fact, does accuracy in estimating reciprocal liking 
increase with further acquaintance, for most sub- 
jects. Estimates of others’ liking for oneself are 
so closely correlated with own liking for those 
same persons (the relationship approaches the 
self-correlation of either measure, at any given 
time), that most of the variance of either can be 
accounted for by the other. Whatever influences 
either of them influences both in about the same 
way. 

These facts—that perceived reciprocation re- 
mains closely tied to own liking without increas- 
ing in accuracy over time—do not mean that 
estimated reciprocation is purely autistic. On the 
contrary, it tends to be quite accurate, differing 
from chance distributions at beyond the .001 level. 
Two of every three estimates, at all times, are in 
the correct half. What these facts do mean, ap- 
parently, is that both attraction toward others and 
its estimated reciprocal are jointly determined by 
autistic and by “realistic” factors, in such manner 
as to remain closely bound together in a relation- 
ship that does not change over time. I believe 
that a clue to the manner of interaction between 
autistic and “realistic” influences is provided by 
the following additional fact. Without exception, 
the men whose liking status rose with time either 
became more accurate in their estimates of re- 
ciprocation or maintained the earlier degree of 
accuracy, while those whose status declined tended 
to become less accurate. Our subjects had no 
difficulty in adapting, realistically, to the fact of 
rising sociometric status, but acceptance of declin- 
ing status was only partial. All subjects distrib- 
uted about the same range of liking scores, but 
each tended to receive a distinctive distribution. 
Estimated reciprocals represent a compromise be- 
tween own liking for the individual in question 
and amour propre. 

The proposition that perceived similarity in 
valuing the self contributes heavily to variance in 
attraction, together with the assumption that self- 
valuation tends to remain high at all times, is thus 
well supported. All persons, at all times, are liked 
according as they are judged to agree with oneself 
about oneself. These judgments become more 
accurate over time to the degree that one’s actual 
changes in status make it possible to judge them 
accurately and at the same time continue to be- 
lieve that one’s own likings are reciprocated. For 
those who are discovering that their actual status 
is relatively low, the conflict—or, more specifi- 
cally, the strain of perceived discrepancy—thus 
aroused is reduced at the cost of accuracy. 

I have already implied that attraction is hypo- 
thetically predictable from cognitive as well as 
from cathectic similarity regarding objects of 
importance. I shall present findings concerning 
cognitive similarity regarding only one kind of 
object—persons. Each subject was asked to de- 
scribe himself as well as the other house members 
by checking adjectives drawn from a list prepared 
by Professor Harrison Gough (8). Each was also 
asked to describe his “ideal self,” by using the 
same list, and to describe himself as he thought 
other house members would describe him. By 
comparing these responses with self-descriptions, 
we obtained measures of perceived similarity re- 
garding the self. (This work closely parallels that 
by Fiedler (6) concerning “assumed similarity.”) 

Attraction turns out to be closely related to per- 
ceived agreement (significant at well beyond the 
.001 level). When the same data are analyzed 
individually, only two of 17 subjects fail to show 
the relationship in the predicted direction, and 
only one of these reverses it. This finding is more 
impressive than it would be if it resulted from 
attributing only favorable judgments of oneself 
to high-liked others, and only unfavorable judg- 
ments to low-liked others. Actually, eight of the 
ten subjects who accepted unfavorable adjectives 
as describing themselves, and who indicated that 
one or more others agreed with them, showed 
more agreement in these unfavorable descriptions 
with high-liked than with low-liked others. The 
relationship between attraction and perceived 
agreement on favorable items is, not surprisingly, 
a good deal closer. At any rate, the finding that 
attraction varies with perceived cognitive agree- 
ment about the self is not merely an artifactual 
result of the common-sense assumption that one 
is attracted toward those who are believed to 
think well of one. Judging from our data, it is 
also true—and perhaps contrary to common sense 
—that we are attracted to those whom we per- 
ceive as seeing both our foibles and our virtues 
as we ourselves see them. Many psychothera- 
pists, I am told, can readily confirm this observa- 
tion.³ I believe, by the way, that the patient’s 
perception of converging attitudes toward himself, 
by himself and therapist, has much to do with the 
phenomena of positive attraction in “transfer- 
ence.”⁴ 

My next prediction deals not with the self as 
object of attitudes but with other house members. 
Of all the objects about which we obtained re- 
sponses, nothing compared in importance or in 
group relevance with the house members them- 
selves. Very early they became differentiated in 
attraction status, so that it was easy to measure 
similarity, on the part of any pair of persons, in 
attraction toward the remaining members. Cor- 
relations were calculated between the attraction 
scores of each member and those of each other 
member (there were 136 such pairs, each year) 
toward all of the other 15 members; this was done 
for each of the 16 weeks that the group lived 
together. Thus the proposition could be tested 
that the greater the similarity between any two 
members in assigning General Liking scores to the 
other 15 members, the higher their attraction for 
each other. A related prediction is that this rela- 
tionship will increase with communication—that 
is, with time. 

3 Dr. Keith Sward, in particular, has called this to 
my attention. 

4 Cf. Rogers (15, pp. 66-96) for empirical evidence 
to the effect that, in at least one case of successful 
psychotherapy, the correlation between the patient’s 
self-sort and the therapist’s description of the patient, 
by a sorting of the same items, increased over time. 
I do not know of other data on this point. 
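
The within-pair agreement index described above (for every pair of members, the correlation between their attraction scores toward all remaining members) can be sketched as follows. This is an illustration under stated assumptions: the text does not specify the correlation coefficient, so a product-moment correlation on the scores is assumed, and the five-member house with its scores is hypothetical. With 17 members there are 136 unordered pairs, each correlated over the other 15 members.

```python
from itertools import combinations
from math import comb, sqrt


def pearson(x, y):
    """Product-moment correlation; assumes neither list is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def within_pair_agreement(liking):
    """Map each unordered pair to the correlation of their scores for everyone else.

    liking[i][j] is member i's liking score (or rank) for member j.
    """
    members = sorted(liking)
    agreement = {}
    for a, b in combinations(members, 2):
        # Exclude the pair themselves: only the remaining members are compared.
        rest = [m for m in members if m not in (a, b)]
        agreement[(a, b)] = pearson([liking[a][m] for m in rest],
                                    [liking[b][m] for m in rest])
    return agreement


# Hypothetical toy house of five members (the study had 17).
toy = {
    "A": {"B": 1, "C": 2, "D": 3, "E": 4},
    "B": {"A": 1, "C": 2, "D": 3, "E": 4},
    "C": {"A": 4, "B": 3, "D": 1, "E": 2},
    "D": {"A": 2, "B": 1, "C": 3, "E": 4},
    "E": {"A": 3, "B": 4, "C": 1, "D": 2},
}
scores = within_pair_agreement(toy)
print(len(scores), comb(17, 2))  # prints: 10 136
```

Each pair's agreement score can then be set against that pair's attraction for each other, week by week, to test the two predictions.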

Both propositions receive clear support, accord- 
ing to both criteria of attraction. On the fourth 
day the relationship between within-pair General 
Liking and within-pair correlation of General Lik- 
ing for remaining members is barely significant, 
and only slightly higher a week later. It increases 
fairly steadily till, at the end of four months, two- 
thirds of all within-pair attractions would be cor- 
rectly placed in the upper or lower half of the 
distribution, judging only from the fact of being 
in the upper or the lower half of the distribution 
of correlations. This finding emerges more 
clearly by comparing the mean within-pair corre- 
lations for various categories of within-pair attrac- 
tion, as shown in Fig. 1. 

Individuals in high agreement with each other 
about the other 15 house members clearly tend 
to be attracted to each other. The opposite tend- 
ency is much less pronounced; none of the cate- 
gories involving subjects in the lower eight ranks 
has a mean correlation much below the average 
of the total set of pairs. The lowest of all the 
mean correlations (shown by the “X” in Fig. 
1) is that of all pairs of which one member— 
and only one—is in the lowest quarter of attrac- 
tion (ranks 13-16). For these 44 pairs the mean 
correlation is .35—not significantly different from 
zero. Thus, the correlations predict not only to 
within-pair attraction but also (particularly at the 
extremes) to interpersonal mutuality, regardless 
of level of attraction; the relationship between 
them, as calculated by χ², is in fact significant at 
the .001 level. 

Though it has, in general, proven easier to pre- 
dict to high than to low attraction, those lowest 
in our house totem pole deserve a paragraph. 
The lowest three in our second-year group were 
truly rejected (according to objective criteria 
which I cannot stop to specify); they were liter- 
ally disliked as none others were. (The next low- 
est two, on the other hand, were near-isolates, 
who were withdrawn and more or less ignored but 
not generally disliked.) All six of the attraction 
responses given and received within this set of 
three rejects were among the lowest possible three 
ranks, their average being exactly 15, when 15.5 
is the lowest possible average; they were liked by 
each other even less than others liked them. At 
the same time, the three intra-pair correlations 
among these three rejects are slightly above the 
average for the entire group of subjects, and .7 
sigmas above the mean correlation for the same 
individuals with all others except the rejects them- 
selves (.52 as compared with .39). In short, they 
disliked each other but tended to agree with each 
other about the remaining individuals more than 
they agreed with the remaining individuals. This, 
of course, is very perverse of them, and it is 
tempting to conclude that such wilful thwarting 
of my favorite hypotheses is all of a piece with 
their personalities, as rejected persons. I shall 
content myself, however, with suggesting that 
these three rejects developed a special set of 
standards: personal inoffensiveness in others was 
highly valued. If such standards did indeed exist, 
I believe they were developed by each of the three 
men in relative independence of the other two. 
They disliked each other too much to be very 
much influenced by each other. Such agreement 
as there was among them concerning the remain- 
ing men occurred, we know, without benefit of 
much communication, and it is well to be re- 
minded that attitudinal similarity can occur on 
the part of individuals in the same predicament 
facing the same objective world, quite independ- 
ently of one another’s influence. 

Since these two predictors (estimated recipro- 
cation and within-pair agreement) are far from 
perfectly correlated (their relationship is indicated 
by a contingency coefficient of .60), one may ask 
about their comparative and their combined pre- 
dictive power. The statistical breakdowns will 
eventually be published, and so I shall not present 
them here. The fact is that if one merely wishes 
to pin-point the individual instances of high at- 
traction, the estimated reciprocal, alone, is the 
most successful of all predictors; 97 per cent of 
the highest quarter of attractions are selected by 
the criterion of the upper half of the estimated 
reciprocals. But if one wishes to account for 
maximum variance, and at both ends of the dis- 
tribution, the combined criteria are better than 
either alone. As indicated by a coefficient of 
contingency of .53 between the combined pre- 
dictors and actual attraction scores, almost one- 
third of the variance in attraction is thus ac- 
counted for. As shown in Fig. 2, high attrac- 
tion is particularly well predicted by the joint 
criteria; virtually none of those predicted as high 
are in fact in the lower half of attraction scores. 

Fig. 2. Per cent of 256 attraction scores selected, 
at each of four levels, by joint criteria (estimated 
reciprocals and within-pair correlations). 
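
The arithmetic behind the statement about variance can be sketched as follows. The sketch assumes the usual Pearson coefficient of contingency, C = sqrt(χ² / (χ² + N)); the text does not give the formula or the underlying table, so the two small tables below are hypothetical. Squaring the reported .53 gives about .28, which the text rounds to "almost one-third"; treating the squared coefficient as a share of variance is the reading implied by the text, not a property of the coefficient in general.

```python
from math import sqrt


def chi_square(table):
    """Pearson chi-square for a table of counts (no zero row or column totals)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for row, row_total in zip(table, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            total += (observed - expected) ** 2 / expected
    return total


def contingency_coefficient(table):
    """Pearson's C = sqrt(chi2 / (chi2 + N)), assumed to be the coefficient meant."""
    n = sum(sum(row) for row in table)
    chi2 = chi_square(table)
    return sqrt(chi2 / (chi2 + n))


# Hypothetical 2 x 2 tables: no association, and perfect association.
print(contingency_coefficient([[10, 10], [10, 10]]))
print(contingency_coefficient([[20, 0], [0, 20]]))

# The reported coefficient, squared.
print(round(0.53 ** 2, 2))  # prints: 0.28
```

Note that C cannot reach 1.0 even for a perfectly associated 2 x 2 table, which is one reason to read the squared value cautiously.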

These findings, based upon small numbers, 
would be subject to much suspicion were they not 
perfectly consistent. Whether by very loose or 
by very restrictive criteria, the predicted relation- 
ships emerge; the more restrictive the criteria, the 
greater the excess over chance expectations. 

At a theoretical level, I consider it highly sig- 
nificant that these two predictors, the combined 
effects of which are more successful than either 
alone, include one subjective index (estimates of 
reciprocal attraction) and one that is objective, 
in the sense of describing a relationship between 
a pair of persons and not referring to either per- 
son alone. Theoretically speaking, this is as it 
should be. Doubtless most forms of social be- 
havior, like attraction, are jointly determined by 
individual characteristics and by relationships to 
others—relationships which pertain to the recipi- 
ent of behavior quite as much as to the behaver 
himself. 

Now let me return, briefly, to our finding that, 
in one year but not in the other, the mere fact of 
being a roommate accounted for much of the 
variance in the development of attraction. I have 
already implied that propinquity is a facilitator 
but not a sufficient condition for the development 
of positive attraction. It should follow, therefore, 
that attraction between roommates will be rela- 
tively high only insofar as their propinquity con- 
tributes to the development of one or more of the 
conditions favorable to high attraction. This is 
exactly what our data show: roommates scored 
much higher on both predictor variables during 
the second year than did non-roommates, but not 
during the first year. 

As shown in Figs. 3 and 4, the year-to-year 
differences in the relationship between attraction 
and room proximity are paralleled by comparable 
differences in the relationship between proximity 
and one of the predictor variables, namely, within- 
pair correlation of attraction toward the other 
members. Roommates differ from all others by 
one full standard deviation at the end of the 
second year, but by only one-fifth of a standard 
deviation at the end of the first year. According 
to the other predictor variable, perceived recipro- 
cality of liking, the differences are of exactly the 
same order, and the curves are correspondingly 
parallel. 

It seems likely, therefore, that proximity con- 
tributes to attraction only by way of the predictor 
variables. As to why room proximity facilitated 
the development of the predictor variables in such 
a way as to lead to high roommate attraction in 
one year but not in the other, I can only say that 
I have some reason to believe that more detailed 
analysis will provide at least partial answers. 

You are doubtless wondering about the gener- 
ality of the proposition that attraction is predic- 
table from similarity of attitude toward important 
and relevant objects, since the only objects that 
I have mentioned, so far, are persons. Although 
our analyses are far from complete, they indicate 
that the proposition also applies to objects other 
than persons, though at lower levels of confidence. 
But it is already clear that, in this research setting, 
there were no objects which compared in rele- 
vance, for all members, to house members them- 
selves. We sampled a range of attitudes that 
extended virtually from cabbages to kings; there 
were several pairs of subjects for whom kings (or 
at least presidents) were highly relevant, and there 
may have been some whose within-pair attraction 
was influenced by attitudes toward cole slaw. 
There were, however, no single non-person ob- 
jects of sufficient relevance for all members to 
account for very much variance in the attraction 
level among all pairs. 

One way of describing this complication is to 
note that our subjects knew so much about so 
many of each others’ attitudes that no single one 
was crucial for all pairs. This predicament is 
well illustrated by a series of experimental find- 
ings. On several occasions, outsiders were 
brought in to present a point of view on a contro- 
versial topic; our subjects’ General Liking for 
these speakers, about whom they knew nothing 
apart from the one topic, was (as predicted) 
closely correlated with perceived agreement with 
them. Perhaps the moral to this story is that, if 
one wants uncomplicated findings, one should 
stick to brief, laboratory-like, rather than to long- 
term, “natural,” situations. 

There were two ways in which we were able, 
nevertheless, to show relationships between at- 
traction and similarity in attitude toward non- 
person objects. The first of these was by regard- 
ing highly generalized values as objects. For 
example, agreement in Allport-Vernon scores was 
related to attraction, for the total population of 
136 pairs; the significance levels ranged from .05 
to .01, depending upon the exact measures of each 
variable. If Osgood’s three-dimensional measure 
of meaning structure (14) may be regarded as a 
highly generalized attitude, of both cognitive and 
cathectic nature, toward things-in-general, then 
the results of using this measure are also relevant. 
“Semantic harmony,” derived from responses to 
a wide range of stimulus words (e.g., father, poli- 
tics, sex, money), was significantly related to at 
least one of our measures of attraction, for all 
136 pairs. 

Our second approach was to take as an index 
of attitude similarity the number of non-person 
objects about which there was a given degree of 
similarity, rather than the degree of similarity 
regarding a single object. This index was related 
to attraction, for at least one of our two sets of 
subjects, though not, apparently, at significance 
levels below .05. This was one of the few meas- 
ures, by the way, of pre-acquaintance similarity 
which successfully predicted, among all pairs, to 
later attraction. If, as appears to be the case, its 
predictive value tended to increase with time, this 
finding would be consistent with the assumption 
that, over time, our subjects tended to sort each 
other out as they gradually discovered one an- 
other’s attitudes on a wide range of issues. 

I have two brief and final comments concerning 
the significance of findings such as I have been 
presenting. First, as to the very limited setting 
in which they were obtained, there is no reason 
to believe that the particular students whom we 
happen to have studied differed very greatly from 
other groups of young-adult peers, in the kinds 
of relationships here reported, at comparable 
stages of acquaintance. Indeed, it is likely that 
the very fact of their homogeneity in regard to 
age and sex and student status tended to reduce 
the variance of many of their attitudes; if so, at 
least some of the predictors here reported would 
prove still more satisfactory with more varied 
groups. I feel, therefore, that I am not grossly 
over-extending the application of my own findings 
when I report, with considerable confidence, that 
the conditions under which attraction develops 
and changes or remains stable are orderly ones. 
It is possible, moreover, to formulate statements 
of these conditions into a consistent body of prop- 
ositions. 

Secondly, as to the common-sense nature of 
much that I have reported, none of you has been 
overcome with astonishment on learning, for ex- 
ample, that our subjects tended to like those by 
whom they thought they were liked, or those
who, they thought, would describe them in most 
favorable terms. My concern is not so much to 
point out that some of our findings are unexpected 
—e.g., that perceived agreement with others con- 
cerning one’s own unfavorable traits is a reason- 
ably good predictor of positive attraction. Nor 
is it to repeat the ancient truism that no one 
knows whether what every one knows is true is 
really true until it has been properly tested. 
Rather, I want to note that several different prop- 
ositions (some conforming to common sense and 
some not), which superficially have nothing to do 
with one another, are derivable from the same set 
of assumptions. 

The fact seems to be that one can predict to 
interpersonal attraction, under specified condi- 
tions, from frequency of interaction, from the 
perception of reciprocated attraction, from cer- 
tain combinations of personality characteristics, 
and from attitudinal agreement. There is no
self-evident reason why such diverse variables, viewed
common-sensewise, should belong together; one 
might almost suspect that they had been drawn 
out of a hatful of miscellaneous variables. But 
predictive propositions about those variables all 
flow, as I have tried to show, from a very few 
psychological assumptions. I believe the conflu- 
ence to be both theoretically required and em- 
pirically supported. These considerations seem to 
me to lend confidence to the point of view that 
a limited theory about a limited class of objects— 
namely, persons—can profit by taking account of 
the significant properties of those objects, and in 
particular those properties closely related to the 
fact of human dependence upon communication. 

You may remember an old story whose punch 
line is “Vive la difference”—Thank God for the 
little difference. If we are inclined to take a 
favorable view of positive interpersonal attraction, 
perhaps we should also be grateful for similarities: 
Vive la similarité! 

Who smokes too much? Who can stop smoking?
Reviewing the sparse literature on smoking and
personality, we could find neither descriptive data
answering these questions nor good theory from
which the answers might be deduced. Therefore, 
we have made a first attempt at delineating a 
psychology of smoking. 

Our basic data come from The Study of Adult 
Development, once known as the Grant Study. 
That study has been described elsewhere (6). 
Our subjects were a panel of 252 Harvard alumni 
who were selected during their sophomore years 
for lack of visible abnormality. During their 
sophomore years, which fell between 1938 and 
1942, they were studied by a great range of medi- 
cal, physiological, psychological, anthropological, 
and sociological techniques. We have been fol- 
lowing these men since by annual questionnaire, 
retesting, and visits. Questions about smoking 
habits were routinely asked in the questionnaires 
during the fifteen or so years the men have been 
followed. This very full material permitted the 
study staff to make comparisons between smoking 
habits and hundreds of variables. Only the find- 
ings relevant to psychosocial theory are reported 
here.²


WHO DOESN’T SMOKE? 


A striking number of our subjects did not 
smoke. When the men were first studied, as 
sophomores, 108 of them (some 45 per cent) 
had not begun to smoke at all. Sixty-one never 
have. That is about 24 per cent of the group. In 
any one year, more than 40 per cent of the men 
will be “off” smoking, though for some this is very 
temporary. 

Our theory begins with Bales’ very general 
propositions about the origins of any compulsive 
habit. Bales (2) began by studying compulsive 
drinkers. He found that “analysis shows the 
craving to be a result of at least two types of un- 
derlying elements: (1) some need or complex of 
needs .. . and (2) an orientating structure of 
habitual thought patterns and associated emo- 
tional justifications.” Orientation has to come 
first; if the experience is found gratifying, it may
be put into the service of needs. As Bales (3)
says, “it is during the formation of the compulsive
habit that the influence of the society and its
culture come to a critical focus in the individual
person and work their effect.” In particular, it is
society that determines whether the new habit
seems to be “a means of relieving . . . inner tensions,
or whether such a thought arouses a strong
counteranxiety.”

2 Some of the other findings of the study staff,
especially those in relation to medical and physical
variables, are reported in Heath, C. Differences
between Smokers and Non-Smokers, in press. A few
items appear in both papers, as they seem relevant.

A table, giving data for each of the findings
reported, has been deposited with the American
Documentation Institute. Order Document No. 5464,
remitting $1.25 for 35 mm. microfilm or $1.25 for 6
by 8 in. photocopies.

Convergent theories of anthropologists, sociolo- 
gists, and psychologists, reviewed and empirically 
confirmed elsewhere (10, 13, 14, 15), all suggest 
that the use of tobacco to resolve major personal 
needs might “arouse a strong counteranxiety” in 
upwardly mobile members of America’s lower 
middle class. It seems likely that this group dif- 
fers in their attitude toward smoking from people 
who stand both above and below them in the so- 
cial hierarchy. Their attitude to smoking would, 
presumably, be corollary of a basic value system 
that these people hold in contrast to most of con- 
temporary society. 

The source of this value system is described by 
Weber (23) as the Protestant Ethic. It is a work 
morality, emphasizing the Devil’s stake in idleness 
and self-indulgence. It is an ethic that abhors as 
Sin the wasting of one’s substance in unprofitable 
trivia, since such waste, in the original Protestant
dogmas, was evidence that a man had not been
vouchsafed Grace. This ethic was, as Mills (18)
remarks, a morality of producers, who abhorred 
consumers. It is typical that, after one especially 
long Calvinistic debate, tobacco was settled upon 
as a good thing because the growing of it produced 
profit. 

This theology of “provident purposes” was soon 
secularized in the form of “worldly asceticism,” as 
Weber illustrates from the diaries of Benjamin 
Franklin. Indeed, the process of lay morality 
descending from religious dogma is still going on, 
even among the Harvard group we are studying, 
as Allport (1) has shown in his study of religion
in the postwar college student. It is thus that 
the traditional middle-class values were gener- 
ated. 

These values are now being replaced in our 
society, according to sociologists. They have be- 
come the property, though perhaps not exclusively, 
of the “Old Middle Class.” They belong to Ries- 
man’s (20) “Inner-Directed” group, so soon to be 
replaced by the Outer-Directed; to Mills’ (18) 
“cheerful robots,” so soon to become equally 
cheerful mass consumers. Yet it is these values
that have generated what Mead (16) calls “the
American core culture” and what Kluckhohn (10)
describes as the dominant value profile in Ameri- 
can culture and expects to find focussed in the 
middle class, and which McArthur (13, 14) has 
empirically shown to be focussed in the lower 
middle. It still gives rise in the American mid- 
dle class to a new generation instilled with the 
“achievement mores” whose operation in a con- 
temporary urban setting is described by McClel- 
land (15). Children reared in this tradition are 
mobile, following what Miller and Form (17) 
describe as Ambitious Careers. As one study 
(14) shows, “The Future, Doing-oriented family 
must produce sons reared in the ‘achievement 
mores,’ taught to look forward to by-passing or 
surpassing their fathers’ occupational roles... . 
It is the hopes of the mother that these sons must 
realize in order to feel successful. They . . . will 
have introjected her precepts. It is these boys 
who will, after college, be expected to leave the 
family and ‘make their own way.’” Children 
reared in this tradition are still (15) weaned and
trained early, encouraged to be ambitious, hard- 
working, and clean-minded. “I know that Andy 
has a clean mind,” wrote one father of a non- 
smoker, “he does not smoke, he does not loiter 
on the street with gangs, he has been brought up 
to look for the fine things in life and only by hard 
work and perseverance can they be achieved and 
enjoyed.” 

In the study, we have found (13) that a simple
empirical tool for bringing out middle vs. upper 
class contrasts is to compare Harvard men who 
prepared for college in public high schools with 
Harvard men who prepared for college in private 
preparatory schools. The high-school graduate
who goes to Harvard has usually received a heavy 
dose of “work morality” during his rearing. As 
an earlier study (14) suggested, “His role in the
family often is that of standard-bearer, the son 
who is to win status or ethnic mobility.” The 
prep-school boy who goes to Harvard is more 
likely to have been bred on a code of acceptable 
behavior. Tradition sets bounds on him in terms 
of “what is done,” not in terms of “worldly 
asceticism” as a means of “getting ahead.” If a 
concentration of nonsmokers exists in our data, 
then we may well predict that it will focus among 
our public-school graduates. 

This is the case. In a steady progression with
status, 20 per cent of the graduates of very exclusive
private schools do not smoke, 30 per cent
of the graduates of less exclusive private schools, 
and 40 per cent of the graduates from public 
schools. This pattern is significant at the .01 
level. Part of the explanation of this fact may be 
that the prep-school boys learned to smoke in 
their dormitories (rules notwithstanding) while 
the high-school graduate did not experience dor- 
mitory life with its peer-orientation until coming 
to college. Some facts fit this notion. In retro- 
spect, about three-quarters of the private-school 
graduates who smoke say that they learned to 
smoke in prep school or at the time of prep 
school; about three-quarters of the public-school 
graduate smokers say they learned to smoke in 
college. As one progresses from private boarding 
to private day to public day schools, there is a 
significant (.01) decline in the proportion of boys 
who already smoked at the time they were taken 
into our study. However, there is a status differ- 
ence operating together with this dormitory fac- 
tor. If we hold boarding constant, there remains 
a significant (.02) difference between the more 
exclusive and the less exclusive schools. Such a 
pattern may well have been generated in the fami- 
lies who elect to and are able to send their sons 
to these schools, rather than in the schools them- 
selves. Besides, the tacit other half of the dormi- 
tory argument is that high schools do not orient 
their pupils toward smoking. This cannot be said. 
Hollingshead (8) tells us that in “Elmtown” high 
school, “practically all boys (91 per cent) smoke, 
irrespective of age or (social) class.” Our pub- 
lic-school nonsmokers are therefore somehow spe- 
cial. It may be supposed that their rearing for 
mobility, in terms of the old ethic described ear- 
lier, may have created a considerable “strong 
counteranxiety” to undermine the orientation from 
their peer culture toward learning to smoke. 


In Elmtown, 

Law and the mores deny high-school students 
the right to enjoy the pleasures derived from 
tobacco, gambling, and alcohol. However, the 
mystery with which adults surround these areas 
of behavior lends them a special value which 
seems to act as a stimulus to many young people
who desire to experience the supposed thrill
of pleasures their elders deny them. The conspiracy
of silence which is an essential part of
the clandestine violation of the mores has already
taught them how easy it is to avoid restrictions
imposed by law and taboo if they are
discreet about how, where, and under what
circumstances it is done. Acquisition of knowledge
of the means of transgressing against alcohol,
tobacco, and gambling taboos without
being caught and the thrill of violating these
taboos take place for the most part in the
clique.

Presumably any peer-oriented youngster is likely
to smoke, whatever his social status, as part of a 
kind of defiant claim on adult status. It may be 
supposed that the more earnest students in high 
school take adult prohibitions more seriously and, 
perhaps, do not always fit well with “the clique.” 
Theirs may be a more compliant claim on adult- 
hood. Perhaps among the earnest nonsmoking 
nine per cent of Elmtown seniors, there was one 
who applied to Harvard. More went to other 
colleges and were mobile into white collar techni- 
cal and professional occupations. It is not to be 
supposed that attendance at any one university is 
a potent cause. Census data (5) show a nation- 
wide increase of nonsmokers when professional 
and technical people are compared with other oc- 
cupational groups. Nonsmoking in America is 
a social class phenomenon. 

It is in England, too. A Hulton survey (7) 
suggested that nonsmoking was slightly commoner 
among men of the English middle class while 
heavy smoking was commoner among English 
working-class men. An excellent market survey ³
in England not only confirms this difference in 
rate of smoking but also documents in many ways 
the fact that nonsmoking is widely perceived as a 
mark of “middle-class respectability.” 

Our data certainly suggest that smoking is 
popularly defined as faintly disreputable or, at 
least, one of the small vices. Nonsmokers are sig- 
nificantly often (.01) nondrinkers; indeed, it 
seems that, as Straus and Bacon (22) report from 
Yale, drinking precedes smoking as the student 
takes up adult foibles. Our nonsmokers are also 
significantly often (.01) people who do not drink 
coffee. 

We may expect some orientation toward non- 
smoking to have been taught these men by their 
religions. The kind of piety that Weber describes 
should spill over into self-denial, even in the ab- 
sence of a specific theological proscription. Pro- 
scription of smoking sometimes does occur. 
Among Protestant denominations, it is usually
those of less social standing (19)—e.g., Fundamentalists—who
condemn smoking, while those
of more fashionable status—e.g., Episcopalians—
do not. The lower-middle-class focus of nonsmoking
may be thus overdetermined. At any
rate, we may find overlap among smoking habits,
social status, and piety.

3 Davis, F. Cigarette Smoking Motivation Study.
This is an unpublished monograph prepared by Research
Services Limited, to whom we are grateful for
access to it.

Earlier in the course of the study, Heath had 
set up ratings of the devoutness of the participants 
and also of their families. These ratings were 
based on considerable information about theo- 
logical beliefs, church attendance and activities, 
personal use of prayer, etc. (No attention was 
paid to vices, great or small, and hence smoking 
in itself did not enter this rating.) The boys who 
did not smoke when they entered the study had 
more devout parents (.05). They were also more 
likely (.01) to attend church while at Harvard, 
despite the absence of any chapel requirement. 
Lifelong nonsmoking is associated even more 
strongly (.02) with devoutness of parents. It 
would seem that the individual lastingly introjects 
his smoking morality! Indeed, the nonsmokers 
are often rated higher on devoutness than their 
parents. 

This introjected piety seems to be more an indi- 
vidual matter than we expected. In the study 
data, nonsmoking is the modal pattern for the 
members of no one religious denomination. Nor
did the interaction between piety and status work 
quite as expected. It is true that, as expected, 
most Protestant families with low incomes were 
rated high on devoutness and produced sons who 
were nonsmokers. Presumably, we see here the 
introjection of pietistic standards, at least in the 
form of “worldly asceticism,” in the absence of 
theological proscription. 

The nonsmokers may represent survivals not 
only of the Protestant ethic and middle-class
morality but also of the related phenomenon Ries- 
man (20) calls “Inner Direction.” For example, 
in answering the Strong Vocational Interest Blank, 
more sophomore nonsmokers (.03) said they pre- 
ferred “nights at home” to “nights away from 
home,” and smokers said the opposite. Also, 
nonsmokers preferred (.02) “belonging to few 
societies,” smokers “belonging to many societies.” 
These trends were especially marked among non- 
smokers from public schools, whose earnest indi- 
vidualism is thus reaffirmed, but private-school 
nonsmokers showed some of the same trend.

A curious sidelight may be interpreted variously:
more sophomore smokers (.01) say they
would prefer “preparing the advertising for the 
machine,” in the appropriate Strong item, while 
nonsmokers prefer not to. It is also true (.05) 
that more smokers responded “Like” to the occu- 
pation of Advertiser, nonsmokers responding “Dis- 
like.” Perhaps the nonsmoker is so Inner Di- 
rected that he does not have his “radar” tuned to 
the latest fad in mass consumption. He is, per- 
haps, not yet part of the consumer ethic that is 
said (18) to be appearing now among the middle 
class. 

A related finding, based on Strong-like items 
in a recent questionnaire, is that smokers respond 
“Like” to the occupation Sales Manager (.01) 
but “Dislike” to the occupation Scientific Research 
Worker (.01). This attitude is reflected in career 
choices: the nonsmokers contribute more physical 
scientists (.01). These facts overlap the status 
differences already observed; few private-school 
boys go into science but many go into business. 
Whether the ethos of the physical scientist over- 
determines his nonsmoking over and above the 
effect of his middle-class origins and upward mo- 
bility is not clear from our data. Certainly the 
two—mobility and science—when combined pull 
powerfully: of the twelve public-school boys from 
low-income families who became scientists, ten 
never smoked and one tried it but stopped. 

In summary, the nonsmoker seems to have been 
oriented by the mores of a particular American 
subculture. He is often of lower-middle-class 
origin and himself upwardly mobile. He shows 
the “worldly asceticism” that has stemmed from 
the old Protestant ethic. Often he is pious, per- 
haps more so than his parents. It seems likely 
that he has reacted to smoking as being one of the 
“small vices” to which the flesh is heir. He is, at 
any rate, an Inner-Directed person, introjecting 
the morals of his youth, perhaps a serious sort, 
and maybe an introvert. He does not go along 
with the suggestions for a consumer morality 
offered by the mass media. He approves scien- 
tific rather than business values and may often 
himself be a scientist or engineer. Just what 
causes underlie this web of correlations is not 
clear, but existing theory about the ethos of the 
Old Middle Class or about the American core 
culture seems relevant. The standards this man 
has introjected furnish sufficiently “strong counter- 
anxiety” to prevent his sharing the orientation 
toward smoking that seems to be common to all
the rest of American society both below and 
above him. 


WHO SMOKES HEAVILY? 


Whether a man smokes or not seems best ex- 
plained by his social orientation. Whether he 
becomes a heavy smoker seems best explained by 
his personal needs. Bales (2) speaks of the ef- 
fectiveness, once a man has been oriented toward 
adopting a habit, of “the degree to which the 
culture operates to bring about acute needs for 
adjustment, or inner tensions, in its members. 
There are many of these: culturally-produced 
anxiety, guilt, conflict, suppressed aggression, and 
sexual tensions of various sorts may be taken as 
examples.” 

There is no doubt that our very few really 
heavy smokers, who meet the criterion for Heavy 
Smoking now customary in medical research— 
“two packs a day for several years”—have more 
than their share of “acute needs for adjustment.” 
Most have had marital problems, some quite dra- 
matic. All are given to impulsive acts, some to 
physical violence, if only in the form of volun- 
teering for dangerous missions. Several are hard- 
driving, tough competitors. None are usual for 
our group. As one observer phrased it, “they are 
men who live in overdrive!” Their stories are 
told by Heath.⁴

That the need to smoke heavily might be gen- 
erated by anxiety is, of course, a common-sense 
hypothesis. Our data warns us to view it with 
caution. We have not been able to correlate 
changes in smoking with changes in the tenseness 
of a man’s life situation, in spite of the “clinical 
hunch” of staff members that this relationship 
held. Perhaps anxious smoking is episodic, so
that our yearly questionnaire is too coarse a
measure of it. Perhaps something is to be learned 
from the answers to a question about the symp- 
toms of stress that our participants felt when 
under pressure. Half of the smokers said they 
“smoked more” in these circumstances. The in- 
teresting fact, though, is that 70 per cent of the 
heavy smokers but only 30 per cent of the light 
smokers said this. Is smoking, then, a suitable 
tension reducer only after it has become a firmly 
ingrained habit? One thinks of learning theory: 
a response must be “high in the habit hierarchy” 
before it has the “availability” to reduce tension.
Or one may think that this little item is a rather
pretty confirmation of Bales’ contention that only
after thorough orientation can a compulsive habit
be used to satisfy anxious needs.

4 Heath, C. Differences between Smokers and Non-
Smokers, in press.

The most striking correlation between heavy 
smoking and personality is found in our inkblot 
test material. In the early days of the study, Dr. 
Wells (24, 25) used an abbreviated Rorschach 
procedure, allowing one minute for response to 
each of the standard blots, which he called the 
Timed Rorschach. The resulting scores are not 
identical with Rorschach scores but seem to be 
interpretable by similar theory. The Rorschach 
variable we concentrated on is the Experience 
Balance, defined as the ratio of the number of 
human movement responses to the weighted sum 
of the various kinds of color responses. If both
members of this ratio are equal to or less than 
2:2, one speaks of a “coarctated” Rorschach, by 
which one means an emotionally narrow, drily 
factual, excessively controlled performance. If 
either member of the ratio exceeds 2, one sort or 
another of emotional expression has taken place, 
and the label “coarctated” no longer applies. 
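
The rule just stated can be put as a minimal sketch. The text says only that the color responses enter as a "weighted sum"; the weights used below (0.5 for FC, 1.0 for CF, 1.5 for C) are the conventional Rorschach weights and are an assumption here, since the Timed Rorschach may have been scored differently. The names `Record`, `weighted_color_sum`, and `is_coarctated` are hypothetical and serve only the illustration.

```python
from dataclasses import dataclass

# Assumed weights for the three kinds of color response; the text does not
# give them, so these conventional values are an illustration only.
COLOR_WEIGHTS = {"FC": 0.5, "CF": 1.0, "C": 1.5}

# "Equal to or less than 2:2": neither member of the ratio may exceed 2.
COARCTATION_LIMIT = 2


@dataclass
class Record:
    """Hypothetical summary of one inkblot record."""
    human_movement: int  # number of human movement responses
    fc: int = 0          # form-dominated color responses
    cf: int = 0          # color-dominated color responses
    c: int = 0           # pure color responses

    def weighted_color_sum(self) -> float:
        """Second member of the Experience Balance."""
        return (COLOR_WEIGHTS["FC"] * self.fc
                + COLOR_WEIGHTS["CF"] * self.cf
                + COLOR_WEIGHTS["C"] * self.c)

    def is_coarctated(self) -> bool:
        """True when both members of the ratio are 2 or less."""
        return (self.human_movement <= COARCTATION_LIMIT
                and self.weighted_color_sum() <= COARCTATION_LIMIT)


if __name__ == "__main__":
    narrow = Record(human_movement=1, fc=2)         # 1 : 1.0
    expressive = Record(human_movement=4, fc=1)     # 4 : 0.5
    print(narrow.is_coarctated(), expressive.is_coarctated())
```

Note that the rule is a conjunction: a record ceases to be coarctated as soon as either member of the ratio exceeds 2, whichever one it is.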

Heavy smokers produce more (.05) coarctated 
records. (Our definition of Heavy Smoking is 
now loosened to include men with lifetime aver- 
ages of a pack a day, so as to include a workable 
number of cases.) What is more, coarctated peo- 
ple tend (.01) to increase the amount of their 
smoking strikingly as the years go by. This is 
not a universal trend; many of our men reach a 
smoking plateau early or vacillate between non- 
smoking and moderate smoking. The coarctated 
person is common among that plurality of our 
people whose smoking curve “snowballs,” show- 
ing a positive acceleration. The appearance of 
these curves suggests to the eye that they be inter- 
preted as showing accelerating habituation. Inter- 
estingly, such curves often appear among people 
who started smoking late. (They are not just 
“catching up,” however; they soon surpass the 
smoking rates of men who started earlier.)

This association between accelerating smoking 
rates and coarctation is strong enough to override 
the other trends so far reported. A coarctated man 
tends to become a heavy smoker no matter what 
his social background or how pious his family— 
or himself. His orientation may show in his being 
a nonsmoker in college if he came from a non- 
smoking type of background. Soon after college, 
he typically begins to smoke, and the habit accelerates.
One is reminded of Bales’ (2) comment
that, “there is reason to believe that if the inner 
tensions are sufficiently acute, certain individuals 
will become compulsively habituated in spite of 
opposed social attitudes.” 

Are our coarctated people that badly off? There 
is no very detailed Rorschach theory (9, 21) from
which one might interpret their tests. They do 
seem to indicate a certain lack of emotional re- 
sources—or unwillingness to use them. Under- 
standably, these men were rated by the psychi- 
atrists (26) as inarticulate, pragmatic, and bland. 
They were not necessarily thought to have inner 
tensions. They were, at any rate, quite hard to 
get to know. Many were rated “Just-So” by the 
psychiatrist; that is, they were compulsive fiddlers, 
desk-arrangers, people who allay their tensions 
through fussy activities, of which smoking could 
understandably become one. This very striking 
finding about coarctation remains hard to explain. 
Nor have we found the means to cross-validate it. 
Since the college Rorschach pattern predicts life- 
time smoking, another long-term study would be 
needed for ideal cross-validation. 

Not all heavy smokers are coarctated, of course. 
Among the noncoarctated subjects, there is a tendency
for the associations between amount of
smoking and some of the psychosocial variables 
previously discussed to be particularly “clean.” 
It is as though coarctation washes out the array 
of emotional variations between persons as well 
as, perhaps, the variability of the smoking rate. 
Among these noncoarctated subjects, it does ap- 
pear that the psychosocial variables have more to 
do with the difference between nonsmoking and 
moderate smoking (as discussed above) than with 
heavy smoking. 

In the noncoarctated group, moderate smoking 
and heavy smoking are respectively related (.001) 
to the psychiatrist’s rating of Strong Basic Person- 
ality and Weak Basic Personality. One presumes 
that the latter rating is appropriate to the rather 
uncontrolled men who were described as being 
among our very heaviest smokers. It is not true 
that everyone who smoked more than a pack a 
day over his adult life was poorly integrated. It 
is true that poorly integrated people were much 
more commonly found among these heavier smok- 
ers. Again, one sees a pattern such as Bales 
would predict: orientation leading to moderate 
smoking but “some need or complex of needs for
adjustment” leading to excessive habituation. 

In summary, then, we may hypothesize that 
starting to smoke is largely brought about by one’s 
social environment but that reactions to smoking, 
once it has started, seem to depend in good part 
on the personal needs that the newly-established 
habit is able to gratify. Some people seize on the 
habit compulsively. These people may often be 
emotionally constricted types for whom there is 
great gain in a simple “flight into behavior” or 
they may be restless, active men, for whom smok- 
ing is just one more impulsive activity. It would 
also seem that anxious people can seize on smok- 
ing as a tension reducer if they have already, for 
other reasons, been oriented toward it. In short, 
the habit, once well available, increases in strength 
if it serves well the person’s emotional economy. 


WHO CAN STOP SMOKING? 


As every habitual smoker knows to his sorrow, 
ability to quit or cut down decreases as smoking 
increases. If we divide our lighter from our 
heavier smokers at an adult lifetime average of 
half a pack a day, the contingency relation be- 
tween “lighter” and “heavier” vs. “can stop” and 
“can't stop,” as shown in recent responses to ques- 
tionnaires, gives a p less than .001. Even more 
striking is the march (.01) of the mean numbers 
of cigarettes smoked per day during adult life. 
For men who can stop smoking this figure (com- 
puted only during their smoking years) is 9, for 
men who don’t try to stop it is 18, and for those 
who cannot stop it is 20. The sheer amount of 
tobacco so far consumed is by far the largest dif- 
ference between the group of men who can stop 
and the group who, at least so far, cannot. 

The variables that were related to becoming 
a heavy smoker seem also to bear some relation 
to ability to stop. Thus, coarctated Rorschachs 
were frequent among people who became heavy 
smokers but it is also true that, with heavy smok- 
ing held constant, the presence of a coarctated 
Rorschach seems related to inability to cut down 
or stop. Some of our scientists have smoked; they 
could easily stop. (They were not heavy smok- 
ers, of course.) Perhaps the scientists show that 
it is easier to quit when one’s compeers do not 
smoke: none of our five writers can quit. Among 
our heavier smokers, the psychiatric labels of 
Sociable, Strong Basic Personality, and Practical 
characterize mostly people who can stop; Weak 
Basic Personality, Asocial, Lack of Purpose and
Values, Introspective, Ideational, and Inhibited 
characterize mostly people who cannot. These 
are all small trends, however, and mostly occur 
in small numbers of cases. Yet these relation- 
ships seem worth mention because of their con- 
gruence with general theory. The variables re- 
lated to smoking compulsively are mostly “need” 
variables, as the Bales theory would require. 

A signal fact is this: ability to stop smoking is 
directly proportional to the number of months 
our subjects were fed from their mothers’ breast! 
The means march (.05) as follows: light smokers 
who can stop were weaned at 8.0 months; heavy 
smokers who can stop were weaned at 6.8 months; 
smokers, mostly heavy, who don’t try to stop were 
weaned at 5.0 months; smokers, mostly heavy, 
who try to stop but cannot were weaned at 4.7 
months. We had previously explored for a pos- 
sible relationship between smoking and weaning 
or amount smoked and weaning, but it was not 
until we explored ability to stop smoking that the 
relationship to breastfeeding became so clean-cut.

Many will wish to explain such a finding away. 
Certainly we would not argue that weaning is the 
cause of smoking. The argument must proceed 
through some tertium quid. In this we would 
agree with Linton (12), who stresses that such
a crude datum as date of weaning or any other 
data “focussed primarily on actual technical op- 
erations, without a correspondingly detailed study 
of the maternal attitudes which accompanied these 
performances,” is naive. The events of infant 
training are part of a broader pattern and sympto- 
matic of it. “We must remember that these influ- 
ences operate on the child from birth, and con- 
tinue to operate on him for a long period of 
time. . . .” The congruence of infant training, 
childhood rearing, and adult values is shown by 
McClelland (15) for two contemporary American 
subcultures. So, in our data, late weaning was 
associated with those personality traits that are 
also related to ability to stop smoking. We do 
not have to postulate that infantile frustration and 
adult cigarette smoking are unmediated cause and 
effect. 

Yet, it is a commonplace that people who stop 
smoking seek other oral gratifications. All the 
“small vices” we have correlated with smoking 
are oral vices: alcohol, coffee, and tobacco are 
taken in through the mouth. So is sugar, which 
Heath shows to be related to smoking. More 
directly to the point: Brozek (4) has experimentally 
documented the notion that men who give up 
smoking gain weight. 

Theoretical reasons suggest that smoking should 
be correlated to psychoanalytic “orality.” Our 
empirical data suggests that it is. Groups with 
the ability to stop smoking contain a smaller 
percentage of bottle babies. Thumbsucking was 
more commonly reported (close to .05) for men 
who continued to smoke. If one does not think 
of a psychoanalytic explanation, then a drive- 
reduction theory like that of Levy (11) seems to 
fit the data. Nor would these ideas be incon- 
sistent with the Bales model: what they suggest 
is that these “deeper” needs have little or no 
effect on whether one smokes but great effect on 
how tenacious the habit, once adopted, may be- 
come. 

In summary, the ability to stop smoking is 
grossly related to the amount of tobacco one has 
consumed. Good mental health, such that one 
has control over one’s habits in general, seems 
to be relevant. So, apparently, is oral gratifica- 
tion received as an infant. This “orality” factor 
may be mediated in many ways, but its meaning- 
fulness should not be overlooked. 


THE RELIABILITY OF THE FINDINGS 


To check the reliability of some of these find- 
ings, we drew five per cent random samples of the 
class of 1958 and the class of 1961 from the files 
of the University Health Services (Ns were 55 
and 58). Every freshman fills in a medical ques- 
tionnaire that contains an item about smoking as 
well as several items relevant to the variables re- 
lated to smoking in the Study of Adult Develop- 
ment. Clearly, only the criterion of smoking in 
college (or near the start of college) was available. 
However, insofar as these checks bore out our 
findings, they showed them to be reproducible 
over a period (1939 to 1957) of almost twenty 
years. 

Both samples showed an excess (.01) of public- 
school nonsmokers. In the first sample, the only 
one for which income data was available, it was 
noted again that income was not as good a pre- 
dictor of smoking behavior as was type of pre- 
paratory school. 

The relationship between seriousness of purpose 
and nonsmoking may be adumbrated by one 
datum. Within the public-school group, there is 
in both samples an excess of nonsmokers among 
those boys who characterize as Definite their pre-college 
choice of career. Similarly, in the sample 
for which income data is available, there is a 
strong tendency for the boys who are both Protes- 
tant and poor not to smoke. These would seem 
to be likely to be the lads for whom Harvard 
meant hard work and mobility. Or, viewed an- 
other way, these would seem likely to include 
most of our members of Fundamentalist denomi- 
nations. 

In both freshmen samples, those boys who an- 
nounce their intention of going into engineering 
or science tend not to smoke. 

The examining physician rated these freshmen 
on “Personality Integration.” For both samples, 
the percentage of smokers marches upwards as 
the ratings become less favorable. For the com- 
bined samples, the percentages of smokers are: 
for men rated “A,” 0 per cent; for men rated 
“B,” 23 per cent; for men rated “C,” 37 per cent; 
and for men rated “D,” 50 per cent. There were 
few A’s and few D’s. The contingency table 
made by combining A and B ratings in one 
column and combining C and D ratings in the 
other and setting these ratings against smoking 
and nonsmoking gives a chi square of 3.6, so p 
is not quite as low as .05. 

A similar rating by the examining physician 
was a prediction of College Adjustment. The 
percentages again march (though most clearly for 
the combined samples) as follows: among those 
rated “A,” there were 13 per cent smokers; among 
the “Bs,” 23 per cent; and among both “Cs” and 
“Ds,” there were 50 per cent. Combining A and 
B and C and D, one gets a contingency table 
from which chi square is 8.0 and p is less than 
.01. Both these findings from the physicians’ 
ratings would seem to parallel the study findings 
that smoking went with psychiatric ratings sug- 
gesting poorer mental health. 
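
As a rough check on the two probability statements above, the reported chi squares can be converted to tail probabilities. The sketch below assumes that each combined table is 2 × 2 (one degree of freedom) and that no further correction was applied; neither assumption is stated in the text, and the function name is illustrative only.

```python
import math


def chi_square_p_value_df1(chi_square: float) -> float:
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    # With df = 1 the statistic is the square of a standard normal deviate,
    # so P(X > x) equals erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi_square / 2.0))


# Chi squares reported in the text for the two physician ratings.
personality_integration_p = chi_square_p_value_df1(3.6)
college_adjustment_p = chi_square_p_value_df1(8.0)

print(f"Personality Integration: chi square 3.6, p = {personality_integration_p:.3f}")
print(f"College Adjustment:      chi square 8.0, p = {college_adjustment_p:.4f}")
```

Under these assumptions a chi square of 3.6 falls just short of the .05 level and a chi square of 8.0 falls below .01, which agrees with the statements in the text.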

The study finding that smoking and drinking 
are correlated was amply confirmed for the class 
of ’58 but not so strongly shown in ’61. In both 
samples, the tendency for drinking to precede 
smoking was noted. 

On the whole, then, where we could cross-check 
the study findings, they seemed to hold up well. 
In this connection, it may be worth pointing out 
that a few of the patterns reported from the Harvard 
data have been found elsewhere. An unpublished 
survey of Yale seniors shows the public-private 
differences in smoking habits still pronounced 
at the end of the college career. The 
priority of drinking to smoking is noticeable. 
(This was, of course, first noted at Yale by Straus 
and Bacon.) The introjection of family standards 
shows very clearly as a source of smoking mores. 
We have already cited notable corroboration from 
market research done in England, where the as- 
sociation of nonsmoking with white-collar, middle- 
class identity was amply confirmed, as well as the 
definition of smoking as one of the minor vices. 
The same pattern is suggested by a national sam- 
ple questioned about their smoking habits by the 
U. S. Bureau of Census (5). Results already 
cited from the University of Minnesota are con- 
sistent with our findings. 

The one area in which no cross-validation has 
been possible has been that of psychodynamics, 
especially the findings from the Timed Rorschach. 
Since these personality patterns are very relevant 
to a theory of smoking, it is to be hoped that 
something may be done to check them. 


SUMMARY 


A large part of what we have learned about 
the correlates of smoking habits may be made to 
fit a conceptual model like that of Bales. The 
fact that a man smokes or does not seems to be 
determined by whether or not he has been oriented 
to the habit as a result of his social milieu. 
Whether he becomes a heavy smoker or is unable 
to stop smoking seems determined by the useful- 
ness of the smoking habit to his personal needs. 

Nonsmokers tend to be lower-middle class in 
origin, upwardly mobile, earnest young men, bred 
in a work morality that is conducive to Inner 
Direction. Their parents and they themselves are 
often pious. They may pursue scientific or tech- 
nical careers in many instances. Smokers are (in 
our data) likely to come from more privileged 
backgrounds, often entering business or human- 
istic careers, often having been raised in a Being 
or Being-in-Becoming orientation. Both subcul- 
ture and the family as the mediator of the sub- 
culture are important determinants of whether and 
when the young man is oriented to the smoking 
habit. 

Whether a smoker becomes a heavy smoker 
seems to depend on whether the habit serves many 
of his important needs. Very anxious or agitated 
men may adopt smoking as a tension reducer, but 
this use of smoking seems to be common only 
when the habit is already well established by other 
circumstances. Emotionally constricted individuals 
seem to “take to” smoking with special eagerness. 

Whether a man can alter his smoking habits 
seems to be most “deeply” determined. Efforts 
to quit or cut down seem to be normal, since 
smoking is quite widely seen as a “small vice,” 
even by smokers. Whether a man can succeed in 
these efforts is first of all a function of how much 
tobacco he has consumed in his lifetime. Certain 
personality variables are also relevant. In gen- 
eral, it is the needs that lead to heavy smoking 
after the habit has begun that also militate against 
quitting or cutting down. The social variables 
that were related to starting to smoke play little 
role in inability to quit. The effects of oral gratification 
at the breast may be relevant to the control of 
excessive smoking, once a great deal of tobacco 
has been consumed. 

Where data were available, the study findings 
were cross-validated against samples from two 
Harvard classes, with generally encouraging re- 
sults. Some confirmation of other findings has 
been had from unpublished studies at distant 
points and from a nationwide study by the Census 
Bureau. 

Not all that we know about smoking fits into 
the proffered conceptual scheme. However, the 
scheme outlined makes our major findings hang 
together and may serve as a first hypothesis in 
further exploration of the psychology of smoking 
habits. 


MINORITY GROUP BELONGING, 
SOCIAL PREFERENCE, AND 
THE MARGINAL PERSONALITY * 


D. W. Lewit 


Since Lewin (5) published his ideas on “self- 
hatred” in ethnically marginal persons, a number 
of relevant studies have appeared which have sup- 
ported most of his contentions. Briefly, Lewin 
asserted that marginal men are frustrated by anti- 
minority prejudice in their continued efforts to 
become accepted by the privileged (dominant) 
minority or the majority, and remain uncertain 
of their group belonging. One consequence is a 
continuing state of high tension which in Lewin’s 
system implies a generalized tendency to aggres- 
sion, development of a negative attitude toward 
the perceived source of frustration, and/or de- 
differentiation of relevant aspects of the perceived 
environment. In the case of ethnically marginal 
persons, “self-hatred” refers to hostility toward 
the minority group as the perceived source of 
frustration, displaced from anti-minority members 
of the dominant group. The personal aspect of 
“self-hatred” is the assumed “unhappiness” (5, p. 
164) or personal insecurity theoretically associated 
with chronic marginal status. 

The assumption that such marginal persons re- 
main uncertain of their group belonging is sup- 
ported by data from Steckler (9) and Radke- 
Yarrow and Lande (7). Steckler reports a posi- 
tive correlation between anti-white and anti-Negro 
attitude scores in a sample of Negro students. 
Radke-Yarrow and Lande found a positive corre- 
lation between anti-Semitism and Jewish ethno- 
centrism scores in a Jewish sample. Glaser (1), 
in his review of literature on ethnic identification, 
states that dominant group bigots and minority 
group chauvinists prefer to associate with mem- 
bers of their own group, while marginal persons 
are inconsistent in their social preferences. Sar- 
noff (8) tested the hypothesis that anti-Semitic 
Jews, relative to non-anti-Semitic Jews, would 
show more personal characteristics similar to those 
of authoritarian “aggressors” with whom they presumably 
identified. Instead of showing uncritical 
glorification of parents, however, they tended 
(more than non-anti-Semitic Ss) to fear and have 
negative attitudes toward parents. Also, rather 
than direct retaliation, the anti-Semitic Jews 
tended to show a relatively passive or self-hostile 
reaction to aggression directed against them from 
out-group members. Projective test data from the 
Radke-Yarrow and Lande sample were consistent 
with Sarnoff’s findings. In general, these results 
are consistent with Lewin’s assumptions. 

* Reprinted by permission from The Journal of 
Abnormal and Social Psychology, November, 1959, 
Vol. 59, No. 3, 357-362. 

It remains to be shown, however, whether mar- 
ginal members of ethnic minorities necessarily or 
usually orient themselves positively toward the 
dominant or most privileged ethnic group. If an 
ethnic group less threatening and more accom- 
modating than the dominant one is also available 
for social intercourse, it is reasonable to suppose 
that marginal members of an ethnic minority may 
orient themselves preferentially toward it. One 
purpose of the present study is to check this as- 
sumption. Another purpose is to determine 
whether marginal status necessarily implies per- 
sonal insecurity, personal indiscriminateness (low 
cognitive differentiation), or socially inconsistent 
standards of perceptual judgment. 

To these ends a social preference technique in- 
volving the expression of preferences for repre- 
sentatives of various ethnic groups in a familiar 
transient public situation was employed to measure 
orientation (operationally referred to hereafter as 
“social preference”), intra-group agreement, and 
intra-individual consistency. The Knutson (4) 
Personal Security Inventory was used to provide 
a measure of insecurity close in meaning to 
Lewin’s “unhappiness.” Marginal Jews, nonmar- 
ginal Jews, and Protestant North Europeans were 
used as Ss to make preferential choices among 
North Europeans, Mediterraneans, Jews, and Ne- 
groes. The following hypotheses, stated so as to 
be consistent with Lewin’s formulations, were 
tested: 

1. Marginal Jews show social preferences more 
like Presbyterians than do nonmarginal Jews. 

2. Marginal Jews show greater personal inse- 
curity than either nonmarginal group. 

3. Marginal Jews show less agreement with one 
another in their social preferences than either non- 
marginal group. 

4. Marginal Jews as individuals show less con- 
sistency in their social preferences than members 
of either nonmarginal group. 


METHOD 


Subjects.—Data were collected at Stanford University 
in the spring of 1951. Ss were 40 marginal 
Jews, 40 nonmarginal Jews, and 40 Presbyterians 
of North European background. All were 
male students ranging in age from 17 to 37. All 
of the Presbyterian Ss indicated that both parents 
were Protestant and had English, Scotch, German, 
Dutch, or Scandinavian surnames. 

In assembling the two Jewish groups, an effort 
was made to get in touch with all male students 
of Jewish background at the University, including 
agnostics. A questionnaire, entitled “Participa- 
tion in Jewish Affairs,” was used to select the 
marginal and nonmarginal Jewish samples, and 
was completed by 159 of these students. It con- 
tained 24 multiple-choice items covering syna- 
gogue and church membership and participation, 
religious education, religious festivals, Bible and 
prayer, dating and marriage, fraternities and re- 
sorts, food habits, charities, reading and radio 
listening, and general social contact. Each of the 
response alternatives had been weighted on a 7-point 
basis from -3 for extreme deviation from 
conventional Jewish norms to +3 for extreme 
adherence to such norms. Weights were assigned 
independently by 10 sophisticated Jewish men and 
women. For 82% of the 95 response alternatives 
the modal value contained a majority of the judg- 
ments. The average of the judged weights for 
any alternative, to the nearest integer, was assigned 
to that alternative. Thus to the question “Have 
you ever deliberately avoided social contacts with 
Jews?” a weight of -3 was assigned to the alternative 
Yes, and 0 to the alternative No. To the 
question “Thinking of the girls you have dated 
at college, what proportion would you say were 
Jewish?” a weight of +3 was assigned to the 
alternative “all have been Jewish,” and -2 to the 
alternative “none have been Jewish.” The range 
of possible scores for the whole scale was -27 
to +34. 
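
The weighting rule just described (average the ten judges' weights for an alternative and take the nearest integer) can be sketched as follows. The ratings shown are hypothetical, the function names are illustrative, and the handling of exact halves (rounded away from zero here) is an assumption, since the text does not say how ties were resolved.

```python
import math


def consensus_weight(judged_weights):
    """Average the judges' weights and round to the nearest integer.

    Exact halves are rounded away from zero (an assumption, not from the source).
    """
    mean = sum(judged_weights) / len(judged_weights)
    # Python's built-in round() rounds halves to the nearest even integer,
    # so the rounding is done explicitly here.
    return int(math.copysign(math.floor(abs(mean) + 0.5), mean))


def scale_score(weights_of_chosen_alternatives):
    """Total questionnaire score: the sum of the weights of the chosen alternatives."""
    return sum(weights_of_chosen_alternatives)


# Hypothetical ratings by ten judges for a single response alternative.
hypothetical_judgments = [-3, -3, -3, -2, -3, -3, -2, -3, -3, -3]
print(consensus_weight(hypothetical_judgments))
```

A respondent's total is then the sum of the consensus weights attached to the alternatives he checked, which is how a score within the stated range would arise.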

The split-half reliability of the scale was .85 
by the Spearman-Brown formula. Some indica- 
tion of its validity may be gained by noting that 
18 out of 19 men selected by two rabbis (Ortho- 
dox and Conservative) as highly conventional in 
Jewish religious practice scored higher on a pre- 
liminary form than 92 out of 96 nominally Jewish 
members of the progressive American Veterans 
Committee in the same region. 
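
For readers unfamiliar with the Spearman-Brown correction, the sketch below shows the usual formula for stepping a half-test correlation up to full length, together with its inverse. The half-test correlation itself is not reported in the text; the value computed here is only what a corrected reliability of .85 would imply, and the function names are illustrative.

```python
def spearman_brown(half_test_r: float, length_factor: float = 2.0) -> float:
    """Predicted reliability when a test is lengthened by length_factor."""
    return length_factor * half_test_r / (1.0 + (length_factor - 1.0) * half_test_r)


def implied_half_test_r(full_test_r: float) -> float:
    """Invert the doubling formula: the half-test correlation implied by a full-length value."""
    return full_test_r / (2.0 - full_test_r)


half_r = implied_half_test_r(0.85)
print(f"implied correlation between halves: {half_r:.3f}")
print(f"stepped up to full length:          {spearman_brown(half_r):.2f}")
```

On this reckoning the two halves of the scale would have correlated somewhat below the corrected figure, at roughly .74.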

For the present study the respondents scoring 
in the lowest quartile, with scores from -12 to 
-1, were retained as the marginal Jewish group. 
Those scoring in the upper quartile, with scores 
ranging from +11 to +29, were designated the 
nonmarginal Jewish group. 

Procedure.—By means of paired comparisons, 
Ss in each of the three groups were tested indi- 
vidually for social preferences, intra-group agree- 
ment, and intra-individual consistency with respect 
to 16 photographs. The photographs comprised 
four North European men, four Mediterraneans, 
four Jews, and four Negroes, with one profes- 
sional, one white-collar worker, one skilled 
worker, and one unskilled worker in each ethnic 
set. Each of the full-face, black-and-white photo- 
graphs was taken under standard flash lighting 
conditions, enlarged to standard size (head length 
about 5 cm.), cut away from its background, and 
mounted on a heavy buff 5 × 7 in. cardboard. 
These men were selected by going to places of 
work where members of the specific ethnic groups 
were likely to be found. Surnames and person- 
ally acknowledged identity were checked with the 
desired ethnic background in all cases except 
Negroes, where such a check was unnecessary. 
Their ages were between 20 and 35. Their attire, 
showing to the middle of the chest, was nearly 
uniform within each occupational level. 

Photographs were presented as a series of 120 
pairs. Each S was instructed to think of himself 
in a cafeteria near where he lives, and to choose 
a seat at a table for two, opposite either of the 
two men represented in the photographs that E 
had placed on the easel. It was indicated that no 
other seats were available in the cafeteria at the 
time, and that each of the two available places 
was equally far from S, who cannot tarry too long 
with his loaded tray. The home-town cafeteria 
situation was chosen because the face-to-face ar- 
rangement suggests more than an impersonal 
relationship, and the possibility exists of being 
observed by a friend or a family associate. Before 
each pair was presented, S was briefly reminded 
of this situation. 

Each pair was presented once, in a fixed ran- 
dom order, with individual photographs counter- 
balanced for left-right position. S indicated which 
man he chose to sit with by pointing. Most Ss 
testified that the imagined context of the choices 
seemed real. They seemed to anticipate each 
choice with interest, were observed to fixate each 
picture at least once, made most of their choices 
within five seconds after presentation, and did not 
tire despite the long series of presentations. 

At the conclusion of the paired comparisons, 
each S filled out a modified short form of the 
Knutson Personal Security Inventory, dealing with 
satisfaction with status, work, personal recogni- 
tion, health, and general happiness. The seven- 
item short form shows a .85 correlation with the 
long form, whose validity is reported by Knut- 
son (4). 


RESULTS 


Social Preferences.—Relative preferences for 
North Europeans, Mediterraneans, Jews, or Ne- 
groes within the ranges of stimulus difference 
represented in the photographs could be deter- 
mined by summing the choices of photographs in 
each of these ethnic categories. The majority of 
such choices would involve not only ethnic dif- 
ferences, but also occupational differences. In 
order to eliminate confounding ethnic with occu- 
pational considerations, only the 24 intra-occupa- 
tional comparisons are considered, with ethnic 
preference scores obtained by summing choices 
of men in each of the four ethnic groups within 
these comparisons. In the same manner, occu- 
pational preference scores may be derived from 
the 24 intra-ethnic comparisons. Analysis of 
variance showed no significant differences among 
marginal Jews, nonmarginal Jews, and Presbyte- 
rians in the distribution of preferences among the 
four occupational levels. 
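
The counts of comparisons mentioned above follow directly from the design of 16 photographs (four ethnic sets by four occupational levels). The sketch below enumerates the pairs and tallies ethnic preference scores from intra-occupational choices; the variable and function names are illustrative, and the sample respondent is hypothetical.

```python
from collections import Counter
from itertools import combinations

# Category labels are taken from the original 1959 text.
ETHNIC_GROUPS = ["North European", "Mediterranean", "Jewish", "Negro"]
OCCUPATIONS = ["professional", "white-collar", "skilled", "unskilled"]

# One photograph for every combination of ethnic set and occupational level.
photographs = [(ethnic, job) for ethnic in ETHNIC_GROUPS for job in OCCUPATIONS]

all_pairs = list(combinations(photographs, 2))
intra_occupational_pairs = [(a, b) for a, b in all_pairs if a[1] == b[1]]
intra_ethnic_pairs = [(a, b) for a, b in all_pairs if a[0] == b[0]]


def ethnic_preference_scores(chosen_photographs):
    """Tally choices by ethnic category (pass choices from intra-occupational pairs only)."""
    tally = Counter(ethnic for ethnic, _job in chosen_photographs)
    return {ethnic: tally.get(ethnic, 0) for ethnic in ETHNIC_GROUPS}


print(len(all_pairs), len(intra_occupational_pairs), len(intra_ethnic_pairs))

# Hypothetical S who always chooses the first-listed member of each pair.
hypothetical_choices = [first for first, _second in intra_occupational_pairs]
print(ethnic_preference_scores(hypothetical_choices))
```

Each ethnic category enters 12 of the 24 intra-occupational comparisons, so an ethnic preference score can range from 0 to 12, and the four scores of any one S sum to 24.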

Let us refer to the array of S groups involving 
marginal Jews, nonmarginal Jews, and Presbyte- 
rians as the “belonging” variable. Similarly, let 
us refer to the dimension involving North Euro- 
pean, Mediterranean, Jewish, and Negro photo- 
graphs as the “ethnic objects” variable. Analysis 
of variance shows that variance due to ethnic 
objects is sufficiently greater than the remainder 
variance (after removal from the total variance 
of variance due to ethnic objects and the ethnic 
objects × belonging interaction) so as to be significant 
at the .01 level. The over-all test for 
hypothesized differences is the interaction between 
belonging and ethnic objects, which proved to be 
significant at the .01 level. In a separate analysis 
of variance involving only the preferences of marginal 
and nonmarginal Jewish Ss on Jews and 
Mediterraneans, the belonging × ethnic objects 
interaction proved to be significant at about the 
.02 level.² 

¹ The paired comparisons technique fixes “between 
Ss” variance including variance due to belonging at 
zero. 

Fig. 1 shows means for each belonging class 
on each of the ethnic objects classes. Relative to 
the nonmarginal Jews, Presbyterians preferred 
North Europeans and Mediterraneans, and 
avoided Jews and Negroes. Marginal Jews tended 
to agree with nonmarginal Jews in avoiding North 
Europeans and favoring Negroes, but showed the 
same pattern of preferences as the Presbyterians 
with respect to Mediterraneans and Jews. That 
is, relative to nonmarginal Jews, marginal Jewish 
Ss tended to favor Mediterraneans and to avoid 
Jews. Each of these differences with respect to 
Mediterraneans and Jews is significant at the .05 
level, the Presbyterian-nonmarginal Jewish difference 
with regard to Jewish ethnic objects being 
significant at the .01 level, by t test. Each of the 
differences with respect to North Europeans and 
Negroes is significant at the .01 level by t test. 

Fig. 1. Mean choices of North European, Mediterranean, 
Jewish, and Negro men by marginal 
Jews, nonmarginal Jews, and Presbyterians in 24 
intraoccupational paired comparisons. 

² Since only half of the intra-occupational comparisons 
are involved here, the “between Ss” variance is 
not fixed at zero. This variance was removed from 
the total variance in figuring the “within Ss” variance 
including the remainder variance. 

Personal Security.—No significant differences 
in personal security mean scores were found be- 
tween groups of Ss. Mean scores for nonmar- 
ginal Jews, marginal Jews, and Presbyterians were 
7.35, 6.88, and 7.50 respectively in a possible 
range of 0 to 10. Corresponding SDs were 1.88, 
1.54, and 1.66. 
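
The absence of a significant difference can be checked approximately from the summary figures alone. The sketch below computes a pooled-variance t for the largest of the three mean differences (Presbyterians against marginal Jews), assuming 40 Ss per group and SDs computed with n - 1 in the denominator. The text does not say which test was actually used, so this is an illustration and not a reconstruction of the original analysis.

```python
import math


def t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Pooled-variance t statistic for two independent groups, from summary statistics."""
    pooled_variance = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    standard_error = math.sqrt(pooled_variance * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / standard_error


# Means and SDs as reported: Presbyterians 7.50 (1.66), marginal Jews 6.88 (1.54).
t_value = t_from_summary(7.50, 1.66, 40, 6.88, 1.54, 40)
print(f"t = {t_value:.2f} with {40 + 40 - 2} degrees of freedom")
```

With 78 degrees of freedom a t of about 1.99 is required at the two-tailed .05 level, so even the largest difference among the three groups falls short of it under these assumptions.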

Among marginal Jewish Ss a significant corre- 
lation ratio was found relating personal security 
and excess of choices of Mediterraneans over 
North Europeans (eta = 0.56, P < .05 for epsilon- 
square). Except for the extreme involving nearly 
total rejection of North European social objects 
relative to Mediterraneans, higher personal secu- 
rity tended to be associated with preference for 
Mediterraneans. Neither nonmarginal Jews nor 
Presbyterians showed any such relationship in cor- 
responding analyses. 

Intra-group Agreement.—Kendall’s (2) u gives 
a measure of agreement among judges making a 
given set of paired comparisons. This statistic is 
based on the proportion of judges for any com- 
parison AB making the judgment “A preferred to 
B” as against the proportion saying “B preferred 
to A,” and considers all comparisons. Consider- 
ing the average of intra-occupational u’s for each 
of the three groups of Ss, over-all u for nonmar- 
ginal Jews was .10, for marginal Jews was .11, 
and for Presbyterians was .22. Using a related 
statistic given by Kendall which is distributed as 
chi square, each of these three u’s proves to be 
significantly different from zero at the .001 level. 
Converting the three chi squares to zs, the dif- 
ference between coefficients of agreement for 
marginal Jews and nonmarginal Jews proved to 
be not significant, while the difference between 
each of these groups and the Presbyterians was 
significant at the .001 level. 
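
A minimal sketch of the coefficient of agreement may help fix ideas. It uses the usual textbook form u = 2Σ / [C(m, 2) · C(n, 2)] - 1, where m is the number of judges, n the number of objects, and Σ is the sum of C(a, 2) over every cell a of the preference table. The small tables below are hypothetical and serve only to show the upper and lower bounds; the averaging of u over occupational levels described above is not reproduced.

```python
from math import comb


def coefficient_of_agreement(preference_counts, n_judges):
    """Kendall's u for paired comparisons.

    preference_counts[i][j] is the number of judges who preferred object i
    to object j (diagonal entries are ignored).
    """
    n_objects = len(preference_counts)
    # Number of pairs of judges who agree, summed over all ordered comparisons.
    agreeing_judge_pairs = sum(
        comb(preference_counts[i][j], 2)
        for i in range(n_objects)
        for j in range(n_objects)
        if i != j
    )
    return 2.0 * agreeing_judge_pairs / (comb(n_judges, 2) * comb(n_objects, 2)) - 1.0


# Hypothetical tables for 4 judges and 3 objects.
unanimous = [[0, 4, 4], [0, 0, 4], [0, 0, 0]]
evenly_split = [[0, 2, 2], [2, 0, 2], [2, 2, 0]]

print(coefficient_of_agreement(unanimous, 4))
print(coefficient_of_agreement(evenly_split, 4))
```

Complete agreement gives u = 1, while an even split among an even number of judges gives the minimum of -1/(m - 1), so values such as .10 or .22 indicate modest but real agreement.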

Intra-individual Consistency.—Among the six 
paired comparisons at any occupational level, so- 
cial preferences indicated by any S may be com- 
pletely consistent (transitive) or may involve 
some inconsistency or intransitivity. For example, 
if we label the four professional men A, B, C, 
and D, the judgments “A preferred to B,” “B 
preferred to C,” and “C preferred to A” are 
intransitive and constitute a circular triad. The 
number of circular triads of photograph pairs was 
determined for each S, adding together the num- 
ber of circular triads from each occupational level. 
No circular triads at all represents perfect consistency 
for an S. 
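
The counting of circular triads can be sketched for one occupational level as follows. Two equivalent routes are shown: direct inspection of every set of three photographs, and Kendall's shortcut from the number of times each photograph was chosen. The labels A to D and the preference sets are hypothetical.

```python
from itertools import combinations


def count_circular_triads(prefers, items):
    """Count intransitive triads; prefers is a set of (chosen, rejected) pairs."""
    count = 0
    for a, b, c in combinations(items, 3):
        cycle_one = (a, b) in prefers and (b, c) in prefers and (c, a) in prefers
        cycle_two = (b, a) in prefers and (c, b) in prefers and (a, c) in prefers
        if cycle_one or cycle_two:
            count += 1
    return count


def circular_triads_from_scores(prefers, items):
    """Same count from choice totals: n(n-1)(2n-1)/12 minus half the sum of squared scores."""
    n = len(items)
    wins = {item: 0 for item in items}
    for chosen, _rejected in prefers:
        wins[chosen] += 1
    return (n * (n - 1) * (2 * n - 1) - 6 * sum(s * s for s in wins.values())) // 12


items = ["A", "B", "C", "D"]
# Hypothetical judgments: fully transitive, and maximally inconsistent.
transitive = {("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D")}
inconsistent = {("A", "B"), ("B", "C"), ("C", "A"), ("D", "A"), ("B", "D"), ("C", "D")}

print(count_circular_triads(transitive, items))
print(count_circular_triads(inconsistent, items))
```

With four photographs at a level, no more than two circular triads can occur, so four levels give the maximum of eight mentioned in the next paragraph.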

Twenty-eight Presbyterians, 23 marginal Jews, 
and 18 nonmarginal Jews showed perfect consist- 
ency, the remainder showing between one and five 
circular triads out of a possible eight. The fre- 
quency of perfect consistency among Presbyterians 
was significantly greater than that for the non- 
marginal Jews, at the .02 level by chi square. The 
intermediate frequency of the marginal Jews was 
not significantly different from either of the other 
groups. 
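
These comparisons can be reworked from the frequencies given, taking 40 Ss per group. The sketch below computes the ordinary 2 × 2 chi square, with the Yates continuity correction as an option; whether the original analysis applied that correction is not stated, so both values are shown for the significant comparison. Function names are illustrative.

```python
import math


def chi_square_2x2(a, b, c, d, yates=False):
    """Chi square for the 2 x 2 table [[a, b], [c, d]], optionally with Yates correction."""
    n = a + b + c + d
    difference = abs(a * d - b * c)
    if yates:
        difference = max(0.0, difference - n / 2.0)
    return n * difference ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def p_value_df1(chi_square):
    """Upper-tail probability for a chi square with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))


# Perfectly consistent vs. not, out of 40 Ss per group (frequencies from the text).
presbyterian = (28, 12)
marginal = (23, 17)
nonmarginal = (18, 22)

plain = chi_square_2x2(*presbyterian, *nonmarginal)
corrected = chi_square_2x2(*presbyterian, *nonmarginal, yates=True)
print(f"Presbyterian vs nonmarginal: {plain:.2f} (p = {p_value_df1(plain):.3f}); "
      f"with Yates {corrected:.2f} (p = {p_value_df1(corrected):.3f})")
print(f"Presbyterian vs marginal:    {chi_square_2x2(*presbyterian, *marginal):.2f}")
print(f"Marginal vs nonmarginal:     {chi_square_2x2(*marginal, *nonmarginal):.2f}")
```

The uncorrected value for Presbyterians against nonmarginal Jews is about 5.1, close to the .02 level reported, while both comparisons involving the marginal Jews give chi squares well under the 3.84 needed for the .05 level.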


DISCUSSION 


The first hypothesis given in the introduction 
stated that marginal Jews would show social pref- 
erences more like Presbyterians than would non- 
marginal Jews. This hypothesis is clearly sup- 
ported by the data. But the patterns of prefer- 
ences of marginal Jews and members of the domi- 
nant group are by no means identical. While 
nonmarginal Jews tend to orient themselves to- 
ward Jews, and Presbyterians toward both North 
Europeans and Mediterraneans, marginal Jews 
tend to orient themselves toward Mediterraneans 
—but not toward North Europeans. In other 
words, the data suggest that marginal Jews may 
identify themselves (or prefer to associate or be 
identified) with a group which is not the dominant 
and perhaps unaccepting ethnic group, but which 
is just as acceptable to members of the dominant 
group as the dominant group itself. Since this 
“compromise” group is on the whole physiognomi- 
cally more similar to the Jewish minority than the 
dominant group, visual “passability” may also be 
a determining factor in orientation. 

The second hypothesis predicted that marginal 
Jews would show greater personal insecurity than 
either of the nonmarginal groups. The data fail 
to support this hypothesis. The recent research of 
Kerckhoff and McCormick (3) and Mann (6) 
suggests that this finding is to be expected if mar- 
ginal Jews do not orient themselves toward the 
dominant North European group. In the Kerck- 
hoff and McCormick study of 54 Chippewa Indian 
children, high insecurity scores were found only 
among those Ss who were sufficiently Indian-look- 
ing to be rejected by whites and who identified 
with white culture. If only one or if neither of 
these conditions applied, personal insecurity was 
lower, i.e., about the same as of white school- 
mates. 
With adult members of the South African Colored 
(mulatto) population, Mann’s results were 
similar. He studied 25 high passable (near 
white) men and women and a matched group of 
25 who were low passable. In addition to meas- 
ures of orientation toward whites and of “mar- 
ginal personality” traits (insecurity, self-pity, and 
sensitivity), Mann obtained a measure of “bar- 
rier acceptance”—acceptance of restrictions ap- 
plied by whites to Colored-white interaction. He 
found no correlation between passability and the 
personality scores. But he found two sets of con- 
ditions under which personal insecurity (and as- 
sociated personality measures combined into one 
score) was high: (a) low passability, low barrier 
acceptance, and a high “undecided” score on the 
white-Colored orientation measure; and (b) high 
passability, low barrier acceptance, and pro-white 
orientation. 

Knowing that the marginal Jews in the present 
study tended not to orient themselves toward the 
dominant North Europeans, it may be expected 
in the light of the two studies just reviewed that 
they would not on the average show greater in- 
security than the other groups of Ss. The rela- 
tionship between security and North European vs. 
Mediterranean orientation, showing greatest per- 
sonal security for marginal Jews who moderately 
favor Mediterraneans and progressively less secur- 
ity as a preference for North Europeans increases, 
provides supporting evidence for the orientation- 
insecurity hypothesis, with orientation as a contin- 
uous variable. It is probable that Sarnoff’s (8) 
anti-Semitic Jews were so strongly oriented to- 
ward North Europeans as to “identify” with them 
and reject themselves as Jews by anti-Semitic cri- 
teria. Some marginal Jews in the present study 
might possibly be so characterized, but evidently 
most of them cannot: it is apparent that negative 
orientation toward the Jewish minority does not 
necessarily imply positive orientation toward the 
dominant North European group. It is reasonable 
to suppose that marginal Jews see the North Eu- 
ropean group as threatening or at least unaccept- 
ing, while the Mediterranean group is seen as 
more accepting and otherwise more permeable. 

Results with respect to the third and fourth 
hypotheses are not surprising in view of the find- 
ings already discussed. Contrary to the deduction 
from Lewin’s theory, marginal Jews are no less 
agreed about in-group and out-group preferences 
than nonmarginal Jews, even though Presbyterians 
are more in agreement. It is reasonable to suppose 
that identification with some non-Jewish 
group—whether ethnically defined, as suggested 
by the preference for Mediterraneans, or non-ethnically 
defined—could maintain agreement 
among marginal Jews at least as great as among 
nonmarginal Jews. It might alternatively be ar- 
gued that Jews, whether marginal or nonmarginal, 
are generally less prejudiced toward out-groups 
than members of the dominant ethnic group, or 
at least lack uniform ethnic standards of social 
judgment. The only data that might be advanced 
from this study in support of this argument is 
that the social preference curves of the two Jew- 
ish groups (Fig. 1) were flatter than that of the 
Presbyterian group. 

The fourth hypothesis dealt with intra-indi- 
vidual inconsistency in preferences to be expected 
from assumed tension states. The data fail to 
show that marginal Jews are any more incon- 
sistent than either of the nonmarginal groups. 
These results agree with the finding that margi- 
nal Jews do not differ from either of the other 
groups of Ss in personal insecurity (tension). 
The significantly greater inconsistency shown by 
individuals in the nonmarginal Jewish group com- 
pared to the Presbyterian group, however, is curi- 
ous. It is possible that nonmarginal Jews were 
less interested than Presbyterians in the social ob- 
jects they were choosing among, but there is no 
evidence for this. It is possible that nonmarginal 
Jews are less rigid in their standards of interpersonal 
preference than Presbyterians, shifting standards 
from one pair of faces to another. It is also 
possible that the social environment of nonmar- 
ginal Jews may be somewhat less differentiated to 
begin with, relative to the other two groups under 
consideration, and that recall of recently expressed 
preferences is therefore less exact than among 
the other groups. No relevant evidence can be 
offered, however, at this time. 

In general, it may be concluded that marginal 
Jews do not necessarily orient themselves toward 
the dominant ethnic group of the society, and 
that tension derived from unwilling nominal mem- 
bership in an underprivileged group may be re- 
duced by identifying or being associated with a 
nonthreatening alternative group. It may prove 
valuable in the future to investigate availability 
of new groups or their norms to marginal individuals
as a factor in marginal personal dynamics distinct
from barriers of prejudice which inhibit social
mobility.


SUMMARY 


Forty marginal Jews, 40 nonmarginal Jews, and 
40 Presbyterians of North European background 
individually made paired comparisons of unla- 
beled photographs of 16 men of four ethnic 
groups. Selection of one photograph of any pair 
meant preference for that man as a luncheon 
partner. Presbyterians as a group preferred North 
Europeans and Mediterraneans, nonmarginal Jews 
preferred Jews, and marginal Jews preferred Medi- 
terraneans. Nonmarginal Jews showed less agree- 
ment in their choices than Presbyterians, and mar- 
ginal Jews showed an intermediate level of agree- 
ment. Both marginal and nonmarginal Jews 
showed less intra-individual consistency in their 
choices than Presbyterians, but marginals did not 
show less consistency than nonmarginal Jews. A 
measure of personal security failed to show dif- 
ferences between the three groups. However, 
marginal Jews who preferred North Europeans to 
Mediterraneans tended to be less secure than those 
who moderately favored Mediterraneans. 

The results suggest that marginal Jews tend to 
identify themselves with non-Jewish groups which 
do not reject them, and consequently do not main- 
tain the tension which is associated with identify- 
ing with an aggressor. 


EMOTIONAL ATTITUDES OF FORMER SOVIET CITIZENS, AS
STUDIED BY THE TECHNIQUE OF PROJECTIVE QUESTIONS *


HELEN BEIER AND EUGENIA HANFMANN 1


In studying the personality patterns of former 
Soviet citizens we used, among other tests, the 
projective questions that had been employed in 
the California studies of authoritarianism (1).
These questions were included mainly because, by 
inquiring about sentiments and cathexes, they elicit 
expressions of significant attitudes and provide 
rich material for personality study. Furthermore, 
since this test had been calibrated as a measure 
of authoritarianism, we planned to compare the 
Russian and the American performance in terms 
of frequency of responses indicating the authori- 
tarian and the equalitarian personality patterns. 
Because of this double utilization of the data, all 
test responses were subjected to two separate pro- 
cedures: they were scored as high, medium, or 
low in authoritarianism,? and, in an independent 
operation, they were classified according to their 
content, The selection of categories for this lat- 
ter ordering was determined partly by the mate- 
rial itself, partly by our attempting to group the 


* Reprinted by permission from The Journal of 
Abnormal and Social Psychology, September, 1956, 
Vol. 53, No. 2, 143-153. 

1 The authors are grateful to the Russian Research 
Center of Harvard University which sponsored this 
study under a grant from the Human Resources Re- 
search Institute of the U. S. Air Force. Drs. 
A. Inkeles and D. J. Levinson read the manuscript 
and made many helpful suggestions. 

2 The scoring in terms of authoritarianism was done 
by J. Orton and H. Beier with the advice and assist- 
ance of D. J. Levinson and Miss L. Heims. 
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responses in ways that would permit a variety of 
psychological analyses. The content categories 
were different for each question and were not 
intended to reproduce the variables by which the 
California scoring is guided. 

In the present communication we shall report 
only the results of the content analysis of the Rus- 
sian and American responses, with the purpose 
of contributing to the description of the emotional 
and evaluative attitudes of the Russian subjects. 
The discussion of the scores of authoritarianism 
as such must be reserved for a later publication: 
it could not be presented without a detailed de- 
scription of the various alternate ways in which 
we tried to meet the difficulties involved in apply- 
ing a scoring system cross-culturally. It should 
be mentioned, however, that none of the variations 
of the scoring procedure we used yielded any 
large differences in the average scores of authori- 
tarianism of the Russian and American samples. 
In the light of this finding, the differences in the 
actual content of the responses of the two groups 
emphasize the fact that both “high” and “low” 
scores can be given to responses indicating very 
different kinds of experience and behavior. Thus 
our results have some bearing on the general 
problem of subpatterns of authoritarianism and 
equalitarianism, a topic which will be taken up 
in the summarizing discussion of the findings. 


METHOD 


The Russian sample included 39 men and 9 
women, almost all of them Great Russians. They 
were selected for a clinical study more or less at 
random from a much larger number of former 
Soviet citizens who were interviewed in Munich 
in the winter of 1950-51 by the Harvard Project 
on the Soviet Social System. Of this group of 48 
subjects, 30 were brought to Germany during the 
war: either as war prisoners, or as civilians im- 
ported by the Germans to increase their labor 
force. Their reason for not returning to their 
native country after the war was fear of the sus- 
picious and punitive attitude of the regime toward 
those who had spent a long period of time outside 
the Soviet Union. The remaining 18 subjects 
(all of them men) found their way to Germany 
after the war, most of them escaping from the 
Russian Army of Occupation; some of them were 
fairly recent arrivals, still very unsettled in their 
new environment. Younger people, who had 
grown up under the new regime, predominated 
in our group: 19 subjects were in their twenties,
19 in their thirties and only 10 older than forty. 
The middle occupational levels (white collar 
workers, technicians) predominated, with 23 sub- 
jects; the other half of the group were evenly 
divided between the professional or managerial 
level, 12, and that of skilled and unskilled laborers, 
including a few peasants, 13. The distribution of 
educational levels followed the same pattern of 
concentration in the middle (high school) level, 
22, but the college graduates predominated over 
those who had had only 1-4 years of schooling, 
16 and 10 subjects, respectively. About half of 
the subjects had been affiliated in the Soviet 
Union either with the Party or with the Commu- 
nist Youth Organization. 

The American group was matched to the Rus- 
sian group, subject by subject, in terms of age, 
sex, educational level, and occupational grouping. 
All subjects came from homes where English was 
spoken, although, in a few cases, not as the only 
language; about half of them were born in Massa- 
chusetts, the rest being divided between other 
eastern states, the South and the Middle West. 
Naturally the two groups could not be matched in 
terms of their life experiences, recent and past. 
In both groups the participation in the study was 
voluntary, and the subjects were mostly recruited 
through organizations to which they belonged, 
and were paid for their time. However, for the 
Russians, the chance to tell the Americans about 
their lives and their grievances made participation 
a much more vital experience than it was for the
American subjects. 

The subjects were given the eight projective 
questions used in the California studies, and one 
additional question. (Because of some accidental 
omissions the number of Russian subjects varies 
for different questions between 45 and 48.) The 
questions were asked in the subjects’ native lan- 
guage, with probing used when necessary, and 
with no time pressure. The answers were re- 
corded by the tester. The test was preceded by 
interviews—very extensive ones in the case of the 
Russians—so that rapport had been well estab- 
lished. Most of the subjects responded to the 
questions naturally and willingly. 

The content categories for each question were 
drawn up by the senior author on the basis of a
preliminary survey of responses. Since these categories
were purely factual, their application demanded
no interpretation of responses and presented
no particular difficulties; coding as done
by the two authors produced very consistent re- 
sults. If the discussion of a question by a subject 
contained several separate points they could be 
entered under different content categories; the 
term “response” as used by us applies to these 
units of content. The Russians were more voluble 
than the Americans and gave a higher total num- 
ber of responses in 7 out of 9 questions. 
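
The difference between counting responses and counting subjects can be
illustrated with a short sketch. The subject identifiers and category
assignments below are invented for illustration and are not data from
the study; Python serves only as a convenient notation.

```python
from collections import Counter

# Hypothetical coded records: (subject_id, content_category).
# One subject may contribute several responses to a single question.
coded = [
    ("R01", "frustration of vital needs"),
    ("R01", "negative external conditions"),
    ("R01", "frustration of vital needs"),
    ("R02", "own moods"),
    ("R03", "frustration of vital needs"),
]

def responses_per_category(records):
    """Count every coded unit of content (a 'response')."""
    return Counter(category for _, category in records)

def subjects_per_category(records):
    """Count each subject at most once per category."""
    return Counter(category for _, category in set(records))

response_counts = responses_per_category(coded)
subject_counts = subjects_per_category(coded)
```

In this invented example the first category holds three responses but
only two subjects; it is the subject count that enters the group
comparisons.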

The statistical comparison of the two groups 
was done in terms of number of subjects giving 
responses in a given content category; the signifi- 
cance of the differences between the groups was 
tested by means of Zubin’s nomographs (3).3
The results of the content analysis will be pre- 
sented mainly in terms of these comparisons, but 
some reference will be made also to the distribu- 
tion of the total responses of each group over the 
various content categories. Whenever the con- 
tent categories bear some resemblance to the spe- 
cific qualitative variables that were mapped out in 
the California studies, these relationships will be 
indicated. 
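
For readers without access to the nomographs, a comparison of this kind
can be approximated by a two-proportion z-test for independent groups.
This is a substitute technique offered as a sketch, not the authors'
procedure, and it is not claimed to reproduce the nomograph values. The
example uses the Question 1 frequencies reported in the Results (28
Russians, 8 Americans) and assumes 48 subjects in each group, although
the number of Russian subjects varied between 45 and 48.

```python
import math

def two_proportion_z(count_a, n_a, count_b, n_b):
    """Return (z, two-sided p) for the difference between two proportions.

    Uses the pooled standard error; assumes independent groups and
    counts that are not all zero or all equal to the group sizes.
    """
    p_a = count_a / n_a
    p_b = count_b / n_b
    pooled = (count_a + count_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Assumed group sizes of 48 each (hypothetical for this illustration).
z, p = two_proportion_z(28, 48, 8, 48)
significant_at_01 = p < 0.01
```

Because the American subjects were matched to the Russians one by one,
a paired test could also be argued for; the independent-groups form is
used here only because the published figures are group totals.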


RESULTS 


Question I. All of us are sometimes in a bad 
mood. What feelings do you think are most un- 
pleasant for you; which of them irritate (upset) 
you most of all? 4

Of all nine questions this one elicited from the 
Russians the largest number of responses referring 
to vital deprivations and to insecurities inherent 
in their past and present situation: such references 
were present in two thirds of the total Russian re- 
sponses. This high incidence is probably due both 
to the nature of the question and to its position 
at the beginning of the series: our subjects utilized 
this first question for pouring out their most press- 
ing concerns. Responses in one or in both of the 
closely related categories: “frustration of vital 
needs” and “negative external conditions” were 
made by 28 Russians and by 8 Americans.5 The
category “no prospects in the future” was used by
the Russians only (9 subjects**).

3 We wish to thank Dr. T. Alper for her advice
and help with the statistical work.

4 In most of the questions we simplified the original
formulations to make them more understandable to
the subjects of lower educational level in both national
groups; consequently the version given here is
somewhat different from that used in the California
studies.

5 This difference is significant at the .01 level of
probability. Such differences will be indicated in the
text by double asterisks, those significant at the .05
level by single asterisks.

Some of the most frequent Russian responses 
given in these categories refer to lack of work 
and of vital necessities, to worry about the fate of 
the relatives who stayed in the USSR, and about 
one’s own uncertain future. This last worry was 
increased by the fear of the Soviets’ invading 
Germany which was widespread in Europe at the 
time of the study: such an event would spell dis- 
aster to the nonreturnees. Following are some 
illustrative quotations from the records: 


I am most disturbed when there is nothing 
to eat, no money to buy clothing, particularly 
if one of the children is sick. When all are in 
good health you can bear these lacks more 
easily. Right now I ought to buy shoes for my 
boy—he has not been able to go out of the 
house for a week. But if we buy clothing we 
need, not enough would be left for food. 

All the moral and physical sufferings that 
result from our abnormal life: absence of regu- 
lar work which would provide personal satis- 
faction and the material basis for living. This 
problem is with us always: this futile chase 
after work.

At home it was fear of getting involved with
the NKVD—how to escape them. Now I am 
disturbed by the feeling that war might start 
any day, that I might wake up and find the Red 
Army here; and this is a fear not just for myself. 


The American discussions contain no compa- 
rable responses. Those few answers that have 
been categorized under “negative external condi- 
tions” refer not to vital but to trivial or even freak- 
ish situations: “I am upset when people smoke in 
my car”; “When somebody pours cold water on 
my back while I am asleep.” Mentions of this 
kind of “external condition” are interpreted in the 
California studies as projections of neurotic anx- 
iety into some irrelevant situations; this interpre- 
tation is much less applicable to the responses of 
the Russian group which, in terms of the Cali- 
fornia variables, seem rather to indicate concern 
with “threatening or non-supporting environment.” 

The next largest category for the Russians is 
that of “interpersonal situations” (24 Russians, 
16 Americans*). It comprises two distinct sub- 
categories which show a clear-cut differentiation 
between the two groups. The majority of the 
Russian responses refer to personal disharmony
as an interactive relationship in which the subject 
feels himself participating, be it in the active or in 
the passive role (23 Russians, 6 Americans**).
The responses of the Americans refer more often 
to moods and traits of other people (grouchy peo- 
ple, overbearing people, irresponsible drivers), 
with the subject himself as an outsider, suffering 
from these offensive traits (2 Russians, 11 Ameri- 
cans**). 

In terms of California variables the Russian 
responses may be viewed as indicating “libidinized 
interpersonal relationships.” This interpretation 
is also supported by the frequency with which 
the family is mentioned, whether in terms of long- 
ing, of concern for their well-being, or of being 
upset by discord (12 Russians, 1 American**). 
Furthermore the Russian responses also contain 
references to injustice and to the sufferings of 
other people, particularly of the Russian people 
under the Soviet regime, and to the lack of under- 
standing of their situation by the West: “When 
one does not understand the Russians, when one 
confuses Russia with the Soviet Union—I will 
argue till I drop unconscious that the Russian 
people are not, as a whole, infected with Com- 
munism.” These semi-ideological responses based 
on identification with a group have no counterpart 
in the American discussions (13 Russians**). 

The category most frequently used by the Amer- 
icans is that of “own moods.” While the Rus- 
sians use it with almost equal frequency (17 Rus- 
sians, 21 Americans), the specific content of the 
responses of the two groups shows as much dif- 
ference as in the case of “external conditions” and 
of “interpersonal situations.” The Russians talk 
about longing, loneliness, fear, and moral suffer- 
ings; they never mention boredom, irritability, and 
restless tension which figure prominently in the
American responses. These objectless moods form 
a transition to the category of “bodily conditions” 
which is used by the Americans only and which 
contains complaints about poor health, sleepless- 
ness, and—most frequently—fatigue (10 Ameri- 
cans**). In terms of California variables such 
references to bodily conditions are assumed to ex- 
press “ego-alien trends,” such as passivity, anxiety, 
and hostility. On the other hand, the Americans 
also exceed the Russians, though not significantly 
so, in the category “frustration of achievement” 
(10 Americans, 5 Russians), a type of response 
which is supposed to reflect conscious conflict and 
self-criticism. 

Question 2. All of us have some desires which 
we try to suppress. Which desires and feelings do 
you find most difficult to control, most difficult to
suppress? 

The Russians made some references to specific 
conditions of their present and past, talking, e.g., 
about how difficult it was to suppress the wish to 
fight, to protest against the Soviet regime, or to 
banish painful thoughts about the relatives left 
behind. However, such references were much 
less frequent than in Question 1. Apart from 
this special category and from the Russians’ 
greater difficulty in understanding the question, 
the distribution of the major content was fairly 
similar in the two groups. 

In each group one third of the subjects made 
references to difficulties in the “control of primi- 
tive impulses.” Within this category the Russians 
mentioned aggression and drinking with equal fre- 
quency, while for the Americans the concern with 
hostile impulses predominated over all others (7 
Russians, 13 Americans). In terms of the Cali- 
fornia variables this might mean the prevalence of 
“non-focal aggression” in the American responses, 
and a greater frequency of the “incidental pleas- 
ures” variable in those of the Russians. How- 
ever, the difference between the two groups was 
significant only in the subcategory of drinking, 
which was used only by the Russians (7 Rus- 
sians*). 

The Americans used two categories that were 
absent in the Russian material. The first com- 
prised the wishes “to quit,” “to go off,” “to do 
what I want to do and not what I must”: to get 
away from the obligatory routine (7 Americans*). 
Such responses are interpreted in the California 
studies as indicative of “ego-alien passivity.” The 
second category referred to the difficulty the sub- 
jects had in suppressing other disturbing feelings 
and moods (5 Americans*). The third minor 
category, which comprised “wishes for fame and 
material possessions,” showed no differentiation 
between the two groups (7 Russians, 7 Ameri- 
cans). 

Question 3. Whom do you consider to be really
great people? What kind of people do you ad- 
mire? 

The bulk of the responses of both groups con- 
sisted of names of admired people, given with or 
without elaboration. They fall into two major 
categories: “artists and scientists” (named by 34 
Russians and 28 Americans), and “social leaders” 
(23 Russians, 28 Americans). A smaller proportion
of responses of both groups was given in
terms of admired personal traits. 

The subcategories of “social leaders” yielded no 
significant differences between the two groups, 
though the Americans tended to favor liberal lead- 
ers, such as Lincoln and F. D. Roosevelt (13 
Russians, 20 Americans); the Russians, while also 
paying tribute to those “who worked and lived for 
the good of the people,” gave no less attention to 
names that connote national power, including the 
military (15 Russians, 8 Americans). It should
be noted, however, that in the latter subcategory 
the Russian list of choices was headed by Peter 
the Great, who represents progress and reform 
as much as national power; among the purely 
military leaders the Russians gave prominence to 
General Kutuzov who is credited with defeating 
Napoleon and who stands for victorious defense 
of the motherland rather than for conquest. 

Pronounced differences between the two groups 
appear in the subcategories of “artists and scien- 
tists.” Scientists are mentioned by them with 
about equal frequency (7 Russians, 10 Ameri- 
cans), but while the Americans give only a slightly 
greater prominence to artists of all descriptions, 
the Russians mention the latter with much greater 
frequency (33 Russians, 13 Americans**). When 
the artists are subdivided into writers (including 
poets), musicians, and painters, the Americans 
are shown to distribute their choices among these 
groups evenly, while the Russians favor the writers 
(31 Russians, 7 Americans**), naming musicians 
much less frequently and painters hardly at all. 
The difference between the two groups with re- 
gard to admiration for writers appears even more 
striking if we take into account that the Russians 
gave 86 single responses in this category and men- 
tioned 34 different names, while the corresponding 
frequencies for the Americans are 10 and 8. 
Prominent among the writers and poets mentioned 
by the Russians were Pushkin, Tolstoi, Dostoev- 
sky, Lermontov, Yessenin; of the foreign writers 
Shakespeare, Victor Hugo, Jack London, and 
Mark Twain were mentioned more than once. 

The Russians’ great admiration for writers is a 
continued cultural tradition; it may indicate the 
high value they place on expression and represen- 
tation of human emotions, as compared with ob- 
jective understanding and with practical action. 
The greater stress placed on these latter by Ameri- 
cans is reflected in the relatively greater weight 
they give to science and to social leadership, in 
their occasionally naming businessmen and sportsmen
whom the Russians do not mention at all,
and in the nature of the character qualities they 
consider admirable. Within this latter category 
they give primacy to traits related to practical 
achievement, such as strength, capability, success, 
while the Russians stress more inward and “moral” 
attributes, such as sincerity and kindness. How- 
ever, the only difference within these minor cate- 
gories that reaches statistical significance is the 
greater frequency with which American subjects 
express admiration for their own relatives (2 
Russians, 9 Americans*). 

Question 4. Nearly every person says at times 
to himself: If this goes on I shall go out of my 
mind. What things can make people go out of 
their minds, lose their senses? 

In this question the frequency rank order of the 
three major categories is the same for both groups, 
although the actual frequencies and the specific 
content of responses are different in some cases. 
For both groups the main cause of mental dis- 
turbance lies in internal psychological conditions, 
in “emotional and mental stress” (26 Russians, 30 
Americans). In the category “external condi- 
tions” the Russians predominate over the Ameri- 
cans (20 Russians, 13 Americans), though not 
significantly so. This relationship is reversed in 
“disturbed personal relations” (4 Russians, 14 
Americans*). 

The differences between the responses of the 
two groups which appear in the subcategories are 
similar to those observed in Question 1. Of the 
Russian responses coded under “external condi- 
tions,” three-fourths refer to conditions of extreme 
pressures and threats which are not mentioned 
by the Americans (14 Russians**); on the other 
hand, the Russians never mention minor environ- 
mental disturbances such as noises, disorder, traffic 
jams, referred to in more than half of the Ameri- 
can responses (7 Americans**). The few Rus- 
sian remarks coded under “disturbed personal re- 
lations” refer to interpersonal discord, while the 
Americans take a more passive attitude in their 
mentions of being persecuted, criticized, exposed 
to a nagging wife or to a drunken husband, or of 
being isolated from “normal relationships.” It is 
interesting to note that while in Question 1 the 
Russians stressed the importance of personal rela- 
tions for their happiness or unhappiness more than 
did the Americans, in thinking about the causes 
of a more catastrophic breakdown the latter show 
more concern than the former about the disastrous
consequences of conflict and isolation. 

In the leading category of “emotional and men- 
tal stress,” the Russians talk mostly about stirring 
traumatic experiences which may overpower the 
person, such as terror, deep grief, or unhappy love 
(26 Russians, 11 Americans**); only the Rus- 
sians occasionally blame insanity on lack of con- 
trol of emotions or of impulses (5 Russians*). 
The Americans ascribe mental illness predomi- 
nantly to conflicts and frustrations created by in- 
ternal obstacles, by “blocks within oneself,” “no 
outlets for your emotions”: a type of explanation 
never given by the Russians (20 Americans**). 
This difference in the concept of mental stress can 
be traced in part to linguistic factors and to rela- 
tively superficial cultural influences. Since psy- 
choanalytically oriented theories have been con- 
demned by the “party line,” even the educated 
Russians have not been exposed to them as the 
American public has been. Terms like “repres- 
sion” or “defenses” are absent from their vocabu- 
lary and from their thinking; the popular term 
“frustration” which was the one most frequently 
used by the Americans in the category of emo- 
tional stress has no exact counterpart in Russian. 
However, the difference in the conception of emo- 
tional causes of insanity might also reflect some 
real differences of psychological functioning. The 
apprehension of the Russians that they will be 
disorganized by overstrong emotions might be a 
consequence of their greater emotional abandon 
and impulsiveness; such experiences could make 
them stress the necessity of deliberate control, of 
holding impulses and emotions in check. In 
terms of California variables, most of their re- 
sponses would fall under the category of “too 
much inner life.” The Americans, on the other 
hand, sound as if they were greatly hemmed in 
by automatic defenses against strong emotional ex- 
periences. To the extent that they possess an 
introspective orientation, and the vocabulary for
describing these “inner psychological states” of
conflict and tension, they express their apprehension
of the consequences of this situation by
ascribing insanity to frustration and repression.

Question 5. What do you think is the worst
crime a person can commit?

The main content category, which comprises
one-half of the responses of each group, is “murder.”
This category was subdivided, as in the
California studies, into plain unqualified references
to murder, and into responses that, by bringing
in the motives for the act, or by some other
qualifications and elaborations, go beyond the ref- 
erence to the external act as such. The first 
subcategory showed no difference between the two 
groups (20 Russians, 20 Americans); however, 
more Russians than Americans gave responses in 
the qualified subcategory, such as “murder com- 
mitted for financial gain,” or “murder by the gov- 
ernment” (16 Russians, 5 Americans**). 

References to stealing and deceit were made by 
Russian subjects only (9 Russians**). References 
to “offenses against oneself and one’s values” 
were made only by Americans (5 Americans*). 
Betrayal of friends or country, cruelty, and sexual 
offenses were mentioned by both groups with 
equally low frequency. 

Question 6. All of us get sometimes into situa- 
tions when we feel very much ashamed and em- 
barrassed. What experiences make you feel like 
sinking through the floor? 

The bulk (three-fourths) of each group’s re- 
sponses to this question fall into three categories: 
“violation of moral values,” “inadequacy,” and 
“violation of conventions.” The first refers to 
failure to live up to one’s obligations, promises, 
or to moral standards of behavior; the formula- 
tion often implies that the person was ashamed 
of what he had done, and not merely humiliated 
by exposure. The second category refers to experiences
of personal inadequacy and inability,
either in work or in interpersonal situations. Re- 
sponses in the third category show a preoccupation 
with conventional rules of behavior and etiquette 
and a fear of appearing ridiculous: having one’s 
clothes in disorder while in public, making an in- 
voluntary faux pas in using a foreign language. 
This category may be considered as the opposite
of the first, because the emphasis here is shifted 
from essential values to superficies of behavior, 
and from one’s own concern with one’s action to 
the unfavorable reaction of others. The category 
of “inadequacy” occupies a middle position be- 
tween the two. Some responses in this category 
reflect the person’s own concern with achievement 
and personal adequacy, but more often than in the 
case of “moral values” the cause of painful em- 
barrassment lies in the exposure of one’s inade- 
quacies to others. 

The distribution of responses over these cate- 
gories is very different for the two groups. While 
the Americans use the three categories with approximately
equal frequency, the Russians refer
most frequently to moral values (30 Russians, 13 
Americans**), and very rarely to conventions (6 
Russians, 16 Americans*). This distribution sug- 
gests that the Russians are less afraid of exposure 
of their personal weaknesses to others, and experi- 
ence less social anxiety than do the Americans. 
No difference was found between the two groups 
in the frequency of the use of the category “in- 
adequacy” (13 Russians, 17 Americans), but its 
specific content tends to be different. The Ameri- 
cans talk almost exclusively about intellectual in- 
feriority, mistakes, or failure to work, being 
proven wrong or criticized in front of others; the 
DPs’ feelings of personal inadequacy are often 
aroused by having no work and no earnings, by 
being forced to ask for help. In “violation of 
moral values” both groups equally emphasize ly- 
ing and dishonesty, with a minor focus on “hurt- 
ing people” in other ways, but while the Ameri- 
cans express these concerns in a rather impersonal 
or self-contained way (“failure to live up to prom- 
ises,” “to discover I was offensive”), the Russians 
often make the situation concrete by referring to 
specific acts and to particular people involved; 
they feel regret, e.g., if they had let down a friend, 
or had been rude to an old person. As in Ques- 
tion 5, occasional references to stealing are made 
only by the Russian subjects. 

Question 7. Suppose you knew you had only
six months to live, but during that time you could 
do whatever you liked—how would you spend 
your time? 

This question invites the subject to explore in
what direction his wishes would take him if all 
obstacles to his actions, including their long-term 
consequences, were removed. The question also 
evokes reactions to imminent death. In both re- 
spects there are pronounced differences between 
the two groups. 

References to “enjoyment, pleasure” form the 
main category for both groups (28 Russians, 29
Americans); however, the distribution of responses
within this category is quite different. In the 
Russian group references to frankly sensual pleas- 
ures, such as eating, drinking, and sex, outweigh 
almost two to one the mentions of such pleasures 
as travel, enjoyment of arts and sports (sensual 
pleasures: 22 Russians, 8 Americans**). In 
terms of California categories the predominant 
Russian responses belong under “open sensuality 
and active pleasure.” Conversely among the 
Americans, references to what one subject termed
“the finer things of life” predominate over the 
cruder pleasures in the relationship of three to 
one, “travel” being their most frequent single 
response (11 Russians, 25 Americans**). These 
responses have a marked affinity to the “inci- 
dental, dilute pleasures” of the California classifi- 
cation. 
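
The paired counts reported for each category (for example, 22 Russians and 8 Americans mentioning sensual pleasures) can be compared with a simple 2 x 2 test of independence. The sketch below is only an illustration and not the authors' procedure: this section states neither the group sizes nor the test behind the asterisks, so `GROUP_SIZE = 50` is a placeholder assumption and the function names are hypothetical. It uses a Pearson chi-square without continuity correction and the closed-form tail probability for one degree of freedom.

```python
import math


def chi_square_2x2(hits_a, hits_b, group_size):
    """Pearson chi-square (1 df, no continuity correction) for two equal-sized groups.

    hits_a, hits_b: number of subjects in each group who gave the response.
    group_size: ASSUMED number of subjects per group (not stated in the text).
    """
    a, b = hits_a, group_size - hits_a      # group A: gave response / did not
    c, d = hits_b, group_size - hits_b      # group B: gave response / did not
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variate with one degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


GROUP_SIZE = 50  # placeholder assumption; the real group sizes are not given here

# Sensual-pleasure responses: 22 Russians, 8 Americans (counts from the text).
chi2 = chi_square_2x2(22, 8, GROUP_SIZE)
p = p_value_1df(chi2)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")
```

Because the statistic depends on the assumed group size, the printed values should not be read as a reconstruction of the published significance levels.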

Other differences are less pronounced. The 
Russians give more responses in the category of 
“religious and moral preoccupations,” expressing 
these concerns either in pure form or as an in- 
tegral part of some conflict about pleasure seeking 
or impulse gratification (16 Russians, 6 Ameri- 
cans*). They also tend to dwell more on the 
“affiliative ties,” if this category is made to in- 
clude not only the wish to spend time with family 
and friends, but also the wish to see Russia once 
more (14 Russians, 9 Americans). Neither group 
expresses any desire to utilize the time left for 
contributing to social welfare, to mankind in gen- 
eral; however, some of the Russians wish to re- 
lieve the suffering of the Russian people, e.g., by 
committing a decisive terroristic act, such as killing 
Stalin. The Americans emphasize personal 
achievement slightly more than do the Russians, 
particularly in the form of a wish to continue and 
complete some piece of work. 

In expressing their reaction to imminent death, 
subjects in both groups comment that this knowl- 
edge might deprive them of any enjoyment and 
enterprise. However, while the Russians occa- 
sionally talk about “going crazy” or “sitting and 
crying,” of committing suicide, the Americans 
often deny the emotional upset explicitly or by 
implication, maintaining that they would carry on 
as usual; some of them express the wish to keep 
their fate secret from their families and friends. 

Question 8. We have the feeling of awe when 
something seems to us remarkable, beautiful, or 
important. What things or events inspire such 
feelings in you? 

Even though an elucidation of the word “awe” 
was included in the question, this term was fre- 
quently not understood, or misunderstood, par- 
ticularly by the Russians. They often gave it 
the meaning of “admiration” and occasionally the 
meaning of “pleasure” or “delight.” Responses 
referring to such events as being given a chance 
to emigrate are caused by this misunderstanding 
which possibly also accounts for some of the dif- 
ferences in the responses of the two groups: 
In the American group the leading category is
that of “individual achievement” (10 Russians, 28 
Americans**). In more than half of the instances 
the Americans specify this achievement as scien- 
tific or artistic, but—unlike the Russians—they 
also make references to outstanding performance 
in technology and in sports (1 Russian, 10 Ameri- 
cans*), while the Russians refer to arts and sci- 
ences almost exclusively. The second place for 
the Americans is occupied by “nature,” which in- 
cludes also references to life processes (6 Rus- 
sians, 16 Americans*). In the Russian group the 
leading category is the “social-interpersonal”: the
subjects feel moved by warm personal relations 
and altruistic acts, as well as by events of wider 
social significance such as the triumph of freedom 
(15 Russians, 7 Americans*). Responses ex- 
pressing “national pride” were equally infrequent 
in both groups (5 Russians, 7 Americans). 

Within the categories used by both groups, the 
formulations of the Americans tend to be more 
schematic and conventional than those of the 
Russians, indicating a less intense and personal 
mode of experience. Thus in the category of 
“nature and life” most of the American responses 
are so brief and stereotyped that they must be 
described as “dilute experiences” in terms of the 
California categories. The Russian responses are 
much more concrete and detailed, more suffused 
with personal meaning and emotion, and conse- 
quently may be considered as representing “in- 
tense nature experiences.” 

Question 9. Suppose you had a child, and you 
knew that you might wish three things for him and 
that your wishes would be fulfilled—what would 
you wish for? 6

Apart from the fact that the Americans gave 
many more general unspecified responses, such as 
“happiness” (5 Russians, 22 Americans**), the 
wishes for the child were fairly similar in the two 
groups. The largest category contained wishes 
for a “successful life” or for a life lived under 
favorable conditions (30 Russians, 34 Ameri- 
cans); in this category the Russians stressed par- 
ticularly the importance of education and of 
specialized training as prerequisites for the “good 
life” (17 Russians, 9 Americans). The Americans,
on the other hand, placed a greater emphasis
on wealth, on financial status (8 Russians, 16
Americans*).


6 This question has been used in T. Dembo’s study,
“Investigation of Concrete Value Systems,” U. S.
Public Health Service, Institute of Mental Health.
Final Report (Unpublished).


While both groups wished for good health for 
the child (20 Russians, 27 Americans), the Rus- 
sians emphasized no less strongly mental quali- 
ties (18 Russians, 12 Americans) and moral or 
character traits (23 Russians, 11 Americans**) 
to which the Americans paid less attention. In 
speaking of mental qualities, the latter mentioned 
intelligence almost exclusively; the Russians divided
their wishes between intelligence and special
talents, particularly the artistic ones (9 Russians,
2 Americans*). The Americans showed
slightly more concern than the Russians with the 
child’s social relations and also made occasional 
references to “full life,” to “secure and harmoni- 
ous development,” which had no counterpart in 
the Russian responses. The Russians occasionally 
expressed the wish that the child should grow up 
to love Russia, and be able to apply his talents in 
Russia. 


DISCUSSION 


In reviewing the trends revealed in the responses
to the nine questions, we shall first organize the 
discussion in terms of the positive and negative 
values held by the two groups, and then consider 
the implications of our findings for some aspects 
of the theory of the “authoritarian personality.” 
We shall use the term “value” in a wide sense, 
including not only explicit evaluations, but also 
consistently positive or negative emotional reac- 
tions of a more immediate kind, as they are ex- 
pressed both in the content and in the formulation 
of responses. It is well to remind the reader at 
this point that the generalizations that follow are 
based merely on the differential frequencies of 
responses in some of the content categories and
that many of these differences are small, or occur 
within minor, infrequently used categories. In 
several questions there is a considerable similarity 
between the two groups, at least in the distribution 
of the major content categories. Furthermore, 
some of the differences that do appear are obviously
created or enhanced by the drastic differences
in their life situations both in the present
and in the recent past, and cannot be ascribed 
exclusively to differences in permanently held 
values. On the other side of the ledger, it is to 
be noted that many of the differences observed 
are not explicable in situational terms alone, and 
that several of the intergroup differences appear 
quite consistently from question to question, even
though they may reach the level of statistical sig- 
nificance only in some of them. 

In reviewing those questions that elicit expres- 
sions of positive values (Questions 3, 7, 8, 9), we 
find that two major values are more prominent in 
the Russian records than in the American ones. 
The first one is the value of emotional experiences 
and of frank sensual pleasures. The Russians’ 
positive acceptance of these experiences is ex- 
pressed both in the content of their responses and 
in the language they use, which is more emotion- 
ally expressive and more concretely descriptive of 
sensual impressions than that used by the Ameri- 
can subjects. Even more pronounced is the Rus- 
sians’ high valuation of interpersonal relation- 
ships, their positive and active acceptance of in- 
teraction and of belongingness with others. This 
attitude is expressed in most of the questions, 
both in an immediate emotional fashion and on 
an ideological level. The other two features more 
prominent in the Russian than in the American 
responses (moral values and patriotic feelings)
also have strong interpersonal implications for our
subjects. Patriotism is often expressed in the context
of wishing to help the Russian people; moral
values, such as sincerity and honesty, are seen as 
prerequisites of friendship. 

The value which the Russians emphasize less 
strongly than do the Americans is first and fore- 
most that of individual achievement; the latter 
mention it more frequently than do the former in 
practically all contexts that evoke discussions of 
this topic. Within this category what seems dis- 
tinctively American is the high valuation of ra- 
tional knowledge and of organizational and physi- 
cal achievement. A second positive category in 
which the Americans have the advantage is an 
infrequent one but it has practically no Russian 
representation. This category includes responses
emphasizing independence and self-expression of 
the individual and a protest against their violation; 
the concept of “crime against oneself” (e.g.,
“deceiving oneself”) also belongs to this category,
which could be termed “integrity of the individ- 
ual.” 

In response to inquiries about negative values 
and sources of negative experiences (Questions 1, 
2, 4, 5, 6), the Russians much more often than 
the Americans see causes of unhappiness in se- 
verely depriving conditions of life, conditions that 
obstruct satisfaction of vital needs or threaten
one’s life or safety. The other two categories in
which their responses by and large exceed those 
of the Americans refer to disturbance of interper- 
sonal relationships and to violation of moral val- 
ues. These three categories, which can be viewed 
as the obverse of the Russian positive values, be- 
long to three different spheres: external, interper- 
sonal, and ethical. Yet all of these categories 
refer to vital conditions of human existence and 
imply the person’s essential interrelatedness with 
his environment. Unhappiness is seen by the 
Russians as resulting from the disruption of this 
relationship either by the “environment” (depriv- 
ing conditions) or by the person (moral viola- 
tion), or by both (interpersonal discord). 

In contrast to this relative homogeneity, the 
negative conditions that are mentioned by the 
comparison group more often than by the Rus- 
sians fall into two major categories which seem 
to be quite different in their psychological mean- 
ing. On the one hand, consistent with their posi- 
tive values of individual achievement and indi- 
vidual integrity, the Americans locate causes of 
unhappiness within the person himself: in his 
conflicts, inhibitions, in his feelings of inadequacy 
and failure. These responses presuppose a self- 
reflective, self-critical attitude which is less promi- 
nent among the Russians. On the other hand, we 
find numerous references to inessential, often 
trivial conditions which have little relation to 
basic human needs, and which are located “out- 
side”: in other people, in things, in the body. 
Annoyances produced by noises, bossy people, in- 
somnia, breaches of etiquette, or impulsive wishes 
to “go off” are likely to be merely symptoms of 
disturbances caused by unconscious conflicts. 
Such responses indicate a defensive alienation 
from one’s feelings which is the opposite of in- 
sight. Yet both categories have the common fea- 
ture of implying the person’s separateness rather 
than close relatedness with his environment. 

Thus a review of the distinctive values of the 
two groups suggests that the Russians are more 
firmly and securely integrated with their environ- 
ment than are the Americans, who are more 
keenly aware of the individual’s separateness and 
isolation. Because the most significant part of 
our environment is people, it is understandable
that the most pronounced differences between the 
two groups appear in the interpersonal area. 

Considering our findings in terms of the subvariables
of the equalitarian and authoritarian personality
patterns, we arrive at the following formulations.
The equalitarian patterns of the Russians
are based to a greater extent than those of
the Americans on the area of interpersonal rela- 
tionships, while those of the Americans stem from 
their concern with the rights and achievements of 
the individual. Within the area of inner experi- 
ence, the Russians’ equalitarian personality struc- 
ture is attested to by their acceptance of impulses 
and emotions, and by moral self-reproach, while 
the corresponding American pattern finds expres- 
sion in a more rational and self-critical introspec- 
tive attitude and in feelings of inadequacy and 
rejection. The authoritarian patterns of the two 
groups also show differences but along different 
lines. If one were to score all Russian responses 
according to the specific rules of the California 
study which have been worked out for the Ameri- 
can groups, a greater proportion of the Russian 
than of the American “high” scores would be 
earned through “moralizing” discussions and 
through expressions of patriotic sentiments—i.e., 
through their ideological attitudes. The high 
scores of the Americans are more often founded 
on defensive personality aspects, such as aliena- 
tion from oneself and from others and displace- 
ment of emotions into irrelevant situations. 
These generalizations must be qualified by some 
further observations which are pertinent to the 
status of ideological attitudes as indicators of 
personality patterns. The wording and the con- 
text of our subjects’ responses suggest that cer- 
tain explicit evaluations do not have the same 
significance for the two national groups, and this 
impression is confirmed by their differential corre- 
lations with the over-all scores of authoritarian- 
ism. Thus the Russians’ traditional admiration 
for great writers does not seem to be diagnostic 
of personal equalitarianism to the same degree as 
it is for the Americans, nor do their feelings for 
their country appear to indicate ethnocentric au- 
thoritarian patriotism. Insofar as the choice of 
particular values and opinions, even when guided 
by personal motives, depends also on their preva- 
lence and meaning in the cultural environment, 
such differences are to be expected. The more 
direct dependence of ideological attitudes on the 
environment makes them less valid as indicators 
of either “high” or “low” patterns than are the 
more deeply ingrained personal traits, particularly
when subjects of different cultural backgrounds 
are being compared. Since our delineation of 
the difference in the authoritarian patterns of the 
two groups is based in part on the Russians’ ideo- 
logical pronouncements, it appears less valid than 
the generalizations that concern the intergroup 
differences in equalitarianism. 7


7 The formulations concerning the equalitarian patterns
are also borne out by the evidence of some
other tests we used in the study. In comparing the 
subjects’ scores on different tests J. Orton found this 
evidence to be particularly clear-cut on the point of 
“interpersonal” vs. individual approach. One of the 
items in the Episodes Test (2) depicted a conflict 
between the individual and the group of which he is 
a member. In discussing this situation, the Russians 
in general identified with the group more strongly 
than with the individual, but this tendency was even 
stronger among those whose scores of authoritarian- 
ism on the Projective Questions were low than among 
the “highs.” The Americans as a group displayed a 
stronger identification with the individual, and here 
the “lows” did so to a greater extent than the “highs.” 
To understand this paradox, one must take into ac- 
count that the Russians identified with the group in 
a very positive, participating way, while the majority 
of the Americans described the group as a coercive 
agent to be either fought with or yielded to, i.e., in 
terms of an irrational authority. 

Some evidence on the “emotional” aspect of the 
“low” Russian pattern was obtained from the tech- 
nique of “short answer items,” used with the Russian 
group only. The responses of the Russians to these 
questions, most of which pertained to various areas 
of Soviet life, were scored in terms of the four scales 
of Flattery, Distortion, Rejection of the basic Soviet 
institutions, and general Anti-Soviet sentiment. Flattery
(i.e., opinions favoring the Americans as, for
example, over the British) was higher among the 
“highs” than among the “lows”: the former are more 
eager to please the momentary authority. For the 
rest of the variables the relationship was reversed. 
The “lows” not only expressed a much stronger anti- 
Soviet sentiment than the “highs,” but also went much 
further in rejecting such widely accepted institutions 
as the state's ownership of industry, and even in deny- 
ing such recognized achievements of the regime as 
increase in literacy and in production of farm equip- 
ment (Distortion). It would seem that respect for 
either factual objectivity or for achievement as such 
is not a component of the Russian “low” pattern. 
Although these subjects would not falsify their evalu- 
ation of the American merits and failings while talk- 
ing to us, for them to pass an objective judgment on 
the instrumental achievements of the Soviets would 
violate the emotional truth in a more important way 
than would detached objectivity. This interpretation 
is borne out by their frequent alternative reaction to 
inquiries about the Soviet achievements: “Yes, but at 
what price—and to what end!” 


By comparison with the attitudes of Americans,
the equalitarian traits of the Russians appear
to be more pronounced in the interpersonal
than in the “individual” area, and in the emotional
rather than in the intellectual sphere.
What implications do these findings have for the 
general theory of authoritarianism? Primarily 
they serve to confirm the existence of subvariants, 
at least within the equalitarian pattern, and to 
identify the two variants that are distinctive of 
the two national groups: one centered on inter- 
personal relatedness and immediate emotional 
awareness, the other on the integrity of the indi- 
vidual and on rational self-reflection. Though 
these two patterns reflect different value-orienta- 
tions, and possibly also different levels of sophis- 
tication, within the theoretical framework of the 
California studies they may be considered as 
equally expressive of genuine equalitarianism. 
It is a task for future studies to verify or disprove 
the existence of these and of other possible pat- 
terns by a systematic investigation of the corre- 
lations between the many specific traits that are 
considered as diagnostic of equalitarianism and 
of authoritarianism. 


SUMMARY 


The projective questions used in the California 
studies of authoritarianism were administered to 
a group of former Soviet citizens and to a com- 
parison group of Americans. The responses were 
coded according to their content. The findings 
are discussed in terms of the emotional and evalu- 
ative attitudes prevalent in each group, and their 
implications for the theory of authoritarian per- 
sonality are pointed out. 


CONFORMITY AND CHARACTER *


RICHARD S. CRUTCHFIELD 1


During the Spring of 1953, one hundred men 
visited the Institute of Personality Assessment and 
Research at the University of California, Berkeley,
to participate in an intensive three-day assessment
of those qualities related to superior functioning
in their profession. 2

As one of the procedures on the final day of 
assessment, the men were seated in groups of five 
in front of an apparatus consisting of five adja- 
cent electrical panels. Each panel had side wings, 
forming an open cubicle, so that the person, 
though sitting side by side with his fellow sub- 
jects, was unable to see their panels. The ex- 
perimenter explained that the apparatus was so 
wired that information could be sent by each man 
to all the others by closing any of eleven switches 
at the bottom of his panel. This information 
would appear on the other panels in the form of 
signal lights, among five rows of eleven lights, 
each row corresponding to one of the five panels. 
After a warm-up task to acquaint the men with 
the workings of the apparatus, the actual proce- 
dure commenced. 

Slides were projected on a wall directly facing 
the men. Each slide presented a question calling 
for a judgment by the person. He indicated his 
choice of one of several multiple-alternative an- 
swers by closing the appropriately numbered 
switch on his panel. Moreover, he responded 
in order, that is, as designated by one of five red 
lights lettered A, B, C, D, E, on his panel. If he 
were A, he responded first, if B, second, and so 
on. The designations, A, B, C, D, and E, were 
rotated by the experimenter from time to time, 
thus permitting each person to give his judgments 
in all the different serial positions. No further
explanation about the purpose of this procedure
was offered.


* Reprinted by permission from The American Psychologist,
May, 1955, Vol. 10, No. 5, 191-198.

1 Adapted from the address of the retiring president
of the Division of Personality and Social Psychology,
American Psychological Association, New York City,
September 4, 1954.

2 The principal study reported here owes much to
the collaboration of Dr. Donald W. MacKinnon, director
of the Institute of Personality Assessment and
Research, and of his staff. Mr. Donald G. Woodworth
has contributed especially to the statistical
analysis of data.


It may help to convey the nature of the men’s 
typical experiences by giving an illustrative de- 
scription of what happens concretely to one of 
the men. The first slide calls for a simple judg- 
ment of which of two geometrical figures is larger 
in area. Since his red light C is on, he waits for 
A and B to respond before making his response. 
And, as he is able to observe on the panel, his 
own judgment coincides with the judgments of 
A and B who preceded him, and of D and E who 
follow him. After judgments on several further 
slides in position C, he is then shifted to position 
D for more slides, then to A. 

The slides call for various kinds of judgments 
—lengths of lines, areas of figures, logical com- 
pletion of number series, vocabulary items, esti- 
mates of the opinions of others, expression of 
his own attitudes on issues, expression of his per- 
sonal preferences for line drawings, etc. He is 
not surprised to observe a perfectly sensible re- 
lationship between his judgments and those of 
the other four men. Where clear-cut perceptual 
or logical judgments are involved, he finds that 
his judgments are in perfect agreement with those 
of the other four. Where matters of opinion are 
involved, and some differences in opinion to be 
expected, his judgments and those of the other 
four men are sometimes in agreement and some- 
times not. 

Eventually the man finds himself for the first 
time in position E, where he is to respond last. 
The next slide shows a standard line and five 
comparison lines, of which he is to pick the one 
equal in length to the standard. Among the 
previous slides he has already encountered this 
kind of perceptual judgment and has found it 
easy. On looking at this slide it is immediately 
clear to him that line number 4 is the correct one. 
But as he waits his turn to respond, he sees light 
number 5 in row A go on, indicating that that 
person has judged line number 5 to be correct. 
And in fairly quick succession light 5 goes on 
also in rows B, C, and D. 

At this point the man is faced with an obvious 
conflict between his own clear perception and a 
unanimous contradictory consensus of the other 
four men. What does he do? Does he rely on 
the evidence of his own senses and respond inde- 
pendently? Or does he defer to the judgment 
of the group, complying with their perceptions
rather than his own? 

We will postpone for a moment the answer as 
to what he does, and revert to the description of 
our apparatus. 

We have been describing the situation as if 
seen from the perspective of one of the men. 
Actually his understanding of the situation is 
wrong. He has been deceived. For the apparatus
is not really wired in the way that he was
informed. There actually is no connection 
among the five panels. Instead, they are all 
wired in an identical manner to a control panel 
where the experimenter sits behind the men. It 
is the experimenter who sends all the information 
which appears on the panels, and the wiring is in 
parallel in such a way that whatever signals are 
sent by the experimenter appear simultaneously 
and identically on all five panels. Moreover, the 
designations of serial order of responding—A 
through E—are identical at all times for the five 
panels, so that at a given moment, for instance, 
all five men believe themselves to be A, or at 
another time, E. 

As we have just said, the responses actually 
made by the five men do not affect in any way the 
panels of the others. They do get registered in- 
dividually on one part of the experimenter’s con- 
trol panel. The latency of each individual re- 
sponse to one tenth of a second is also recorded 
by timers on the control panel. 

Hence, the situation as we have described it 
for our one illustrative man is actually the situa- 
tion simultaneously experienced by all five men. 
They all commence in position C, and all shift 
at the same time to position D, and to A, and 
finally E. They all see the same simulated group 
judgments. 

The entire situation is, in a word, contrived, 
and contrived so as to expose each individual to 
a standardized and prearranged series of group 
judgments. By this means the simulated group 
judgments can be made to appear sensible and 
in agreement with the individual, or, at chosen 
critical points, in conflict with his judgments. 
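
The wiring logic described above can be summarized in a short sketch. It is an illustrative model only, not a description of the actual circuitry: the class and method names (`Panel`, `ControlPanel`, `set_position`, `simulate_judgment`, `record`) are hypothetical. What it tries to capture from the text is that the experimenter's console is the sole source of signals, that every panel receives identical lights and the same serial-order designation, and that each man's real response and its latency (to a tenth of a second) reach only the experimenter.

```python
from dataclasses import dataclass, field

N_PANELS = 5  # five subjects are tested together


@dataclass
class Panel:
    """One subject's cubicle: the order light and the signal lights he sees."""
    position: str = ""                           # serial-order light (A to E)
    lights: dict = field(default_factory=dict)   # row letter -> lit light number


@dataclass
class ControlPanel:
    """Experimenter's console, wired identically to all five panels."""
    panels: list = field(default_factory=lambda: [Panel() for _ in range(N_PANELS)])
    log: list = field(default_factory=list)      # (panel index, switch, latency)

    def set_position(self, letter):
        # All panels get the SAME designation, so every man believes he
        # occupies the same serial position at the same moment.
        for panel in self.panels:
            panel.position = letter
            panel.lights = {}

    def simulate_judgment(self, row, light):
        # A simulated "group member" response, shown identically on all panels.
        for panel in self.panels:
            panel.lights[row] = light

    def record(self, panel_index, switch, latency_s):
        # Real responses go only to the experimenter, never to other panels.
        self.log.append((panel_index, switch, round(latency_s, 1)))


# First critical slide: everyone is "E"; rows A to D all appear to choose line 5.
console = ControlPanel()
console.set_position("E")
for row in "ABCD":
    console.simulate_judgment(row, 5)
console.record(panel_index=0, switch=4, latency_s=3.24)  # an independent answer
```

In this model a conforming subject would be one whose recorded switch matches the simulated consensus (5) rather than the correct line (4).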

Most of you will recognize at once the basic 
similarity of our situation to that invented by 
Asch (2) in his extremely important work of re- 
cent years on independence of individual judg- 
ment under opposing group pressure. In his 
method, ten subjects announced aloud and in
succession their judgments of the relative length 
of stimulus lines exposed before the group. The 
first nine subjects were actually confederates of 
the experimenter, and gave uniformly false an- 
swers at pre-established points, thus placing pres- 
sure on the single naive subject. 

For extensive research use, for instance in per- 
sonality assessment, Asch’s technique is handi- 
capped by the severely unfavorable ratio of con- 
federates to true subjects. The present technique, 
utilizing the electrical network described above, 
avoids this difficulty. There are no confederates 
required; all five subjects are tested simultane- 
ously in a thoroughly standardized situation. The 
experimenter exercises highly flexible control of 
the simulated group judgments, and of the serial 
order of responding. Stimulus material to be 
judged can be varied as widely as desired by use 
of different slides. 

Now at last come back to our man still sitting 
before his panel, still confronted with the spuri- 
ous group consensus, still torn between a force 
toward independent judgment and a force toward 
conformity to the group. How he is likely to 
behave in the situation can best be described by 
summarizing the results for our study of 50 of 
the 100 men in assessment. 


EFFECTS OF CONSENSUS 


All of these men were engaged in a profession 
in which leadership is one of the salient expected 
qualifications. Their average age was 34 years. 
Their educational levels were heterogeneous, but 
most had had some college training. 

Fifty of the men were tested in the procedure 
as described. Another 40 served as control sub- 
jects; they simply gave individual judgments of 
the slides without using the apparatus, and hence 
without knowledge of the judgments of others. 
The distribution of judgments of these control 
subjects on each slide was subsequently used as 
a baseline for evaluating the amount of group 
pressure influence on the experimental subjects. 

Now as to results. When faced with the di- 
lemma posed by this first critical slide, 15 of the 
50 men, or 30 per cent, conformed to the ob- 
viously false group consensus. The remaining 
70 per cent of the men maintained independence 
of judgment in face of the contradictory group
consensus. 

The first critical slide was followed by 20 
others, all with the subjects responding in posi- 
tion E. The 20 slides involved a broad sampling 
of judgmental materials, exploring the question 
of what would happen to other kinds of percep- 
tions, to matters of factual appraisal and of logic, 
of opinion and attitude, of personal preference— 
all under the same conditions of group pressure. 
Interpolated among them were occasional neutral 
slides, in which the group consensus was simu- 
lated as correct or sensible, in order to help main- 
tain the subjects’ acceptance of the genuineness 
of the apparatus and situation. 

The results on several more of the critical 
slides will give a representative picture of what 
happens under group pressure. First, take an- 
other kind of perceptual judgment. A circle and 
a star are exposed side by side, the circle being 
about one third larger in area than the star. The 
false group consensus is on the star as the larger, 
and 46 per cent of the men express agreement 
with this false judgment. 

On a simple logical judgment of completion of 
a number series, as found in standard mental 
tests, 30 per cent of the men conform to an 
obviously illogical group answer, whereas not a 
single control subject gives an incorrect answer. 
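
The percentages quoted so far follow from simple counts, and the control subjects supply the baseline against which they are read. The sketch below works through that arithmetic. The helper names are hypothetical, and treating the influence effect as the plain difference between the experimental and control percentages is an assumption made for illustration; the text says only that the control distribution served as a baseline.

```python
def percent(count, total):
    """Share of subjects, as a percentage, giving a particular response."""
    return 100.0 * count / total


def net_influence(experimental_pct, control_pct):
    """ASSUMED measure: excess of the experimental rate over the control baseline."""
    return experimental_pct - control_pct


# First critical slide: 15 of the 50 experimental subjects conformed.
first_slide_pct = percent(15, 50)

# Number series: 30 per cent of experimental subjects gave the illogical
# answer; none of the 40 control subjects answered incorrectly.
series_control_pct = percent(0, 40)
series_effect = net_influence(30.0, series_control_pct)

print(first_slide_pct, series_control_pct, series_effect)
```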

As striking as these influence effects are, they 
are overshadowed by the even higher degree of 
influence exhibited on another set of items. 
These pertain to perceptual, factual, and logical 
judgments which are designed to maximize the 
ambiguity of the stimulus. There are three such 
examples: (a) two actually equal circles are to 
be judged for relative size; (b) a pair of words 
are to be judged as either synonyms or antonyms, 
though actually entirely unrelated in meaning and 
unfamiliar to all subjects; (c) a number series is 
to be completed which is actually insoluble, that 
is, for which there is no logically correct com- 
pletion. 

To take the third example, which gives the 
most pronounced influence effect of all 21 critical 
items, 79 per cent of the men conform to a 
spurious group consensus upon an arbitrarily 
chosen and irrational answer. 


Influence effects are found, we see, on both
well-structured and poorly structured stimuli,
with markedly greater effects on the latter.

Turning from perceptual and factual judg- 
ments to opinions and attitudes, it is clearly evi- 
dent that here, too, the judgments of many of the 
men are markedly dependent upon a spurious 
group consensus which violates their own inner 
convictions. For example, among control sub- 
jects virtually no one expresses disagreement with 
the statement: “I believe we are made better by 
the trials and hardships of life.” But among the 
experimental subjects exposed to a group con- 
sensus toward disagreement, 31 per cent of the 
men shift to expressing disagreement. 

It can be demonstrated that the conformity 
behavior is not found solely for attitudes on 
issues like the foregoing, which may be of rather 
abstract and remote significance for the person. 
Among the control sample of men, not a single 
one expresses agreement with the statement: “I 
doubt whether I would make a good leader,” 
whereas 37 per cent of the men subjected to group 
pressure toward agreement succumb to it. Here 
is an issue relating to appraisal of the self and 
hence likely to be of some importance to the 
person, especially in light of the fact already men- 
tioned that one of the salient expected qualifica- 
tions of men in this particular profession is that 
of leadership. 

The set of 21 critical items ranges from fac- 
tual to attitudinal, from structured to ambiguous, 
from impersonal to personal. With only two 
exceptions, all these items yield significant group 
pressure influence effects in our sample of 50 
men. The very existence of the two exceptional 
items is in itself an important finding, for it 
demonstrates that the observed influences are not 
simply evidence of indiscriminate readiness to 
conform to group pressure regardless of the 
specific nature of the judgment involved. The
character of the two exceptional items is signifi- 
cant, for they are the two most extremely per- 
sonal and subjective judgments, namely, those in 
which the individual is asked which one of two 
simple line drawings he prefers. On these slides
there is virtually no effective result of group pres- 
sure. Not more than one man of the 50 ex- 
presses agreement with the spurious group con- 
sensus on the nonpreferred drawing. Such per- 
sonal preferences, being most isolated from the 
relevance of group standards, thus seem to be
most immune to group pressure. 


INDIVIDUAL DIFFERENCES 


To what extent do the fifty men differ among 
themselves in their general degree of conformity 
to group pressure? 

A total “conformity score” is readily obtain- 
able for each individual by counting the number 
of the 21 critical items on which he exhibits in- 
fluence to the group pressure. The threshold for 
influence for each item is arbitrarily fixed on the 
basis of the distribution of judgments by control 
subjects on that item. 
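
The counting rule just described can be sketched in a few lines. This is only an illustration under stated assumptions: the source does not say how responses were coded or how each threshold was set, so the numeric coding, the name `conformity_score`, and the sample values below are all hypothetical.

```python
from typing import Sequence


def conformity_score(responses: Sequence[float], thresholds: Sequence[float]) -> int:
    """Count the items on which a subject's response reaches the item's threshold.

    Assumption (not from the source): each response is coded so that larger
    values mean a larger shift toward the false group consensus, and each
    threshold was fixed beforehand from the control subjects' judgments.
    """
    if len(responses) != len(thresholds):
        raise ValueError("need exactly one threshold per item")
    return sum(1 for r, t in zip(responses, thresholds) if r >= t)


# Hypothetical five-item illustration; the actual scale had 21 critical items.
example_responses = [3, 1, 4, 2, 5]
example_thresholds = [3, 2, 4, 4, 1]
example_score = conformity_score(example_responses, example_thresholds)
```

Keeping the thresholds separate from the counting makes the arbitrary part of the procedure explicit: changing how a threshold is fixed changes the scores, but not the rule for summing them.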

Considering that we are dealing with a fairly 
homogeneous sample of limited size, the range 
of individual differences that we obtain is aston- 
ishingly large, covering virtually the entire pos- 
sible scope of our measure. At the lower ex- 
treme, several of the men showed conformity on 
no more than one or two of the critical items. 
At the upper extreme, one man was influenced 
on 17 of the 21 items. The rest of the scores are 
well distributed between these extremes, with a 
mean score of about eight items and a tendency 
for greater concentration of scores toward the 
lower conformity end. 

The reliability of the total score, as a measure 
of generalized conformity in the situation, is ob- 
tained by correlating scores on two matched 
halves of the items. The correlation is found to 
be .82, which when corrected for the combined 
halves gives a reliability estimate for the entire 
21-item scale of .90. 
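
The correction for combined halves is not named in the text. Assuming it is the usual Spearman-Brown step-up formula for doubling test length, the reported figures are consistent with each other, as the following worked check suggests (the symbols are introduced here for illustration only):

```latex
r_{\text{full}} = \frac{2\, r_{\text{half}}}{1 + r_{\text{half}}}
              = \frac{2(.82)}{1 + .82}
              = \frac{1.64}{1.82}
              \approx .90
```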

To recapitulate, we find large and reliable dif- 
ferences among the 50 men in the amount of con- 
formity behavior exhibited, and there appears to 
be considerable generality of this conformity behavior
with respect to widely varied judgmental
materials. Whether such conformity tendencies 
also generalize to other, quite different behavioral 
situations is a question for future research. 


RELATIONS TO PERSONALITY VARIABLES 


Assuming that we are, indeed, measuring con- 
formity tendencies which are fundamental in the 
person, the question is what traits of character 
distinguish between those men exhibiting much 
conformity behavior in our test and those exhibit- 
ing little conformity. The assessment setting 
within which these men were studied provides an 
unusually fertile opportunity to explore this question,
in light of the wide range of personality
measurements available. 

Correlational study of the conformity scores 
with these other variables of personality provides 
some picture of the independent and of the con- 
forming person. As contrasted with the high 
conformist, the independent man shows more in- 
tellectual effectiveness, ego strength, leadership 
ability and maturity of social relations, together 
with a conspicuous absence of inferiority feelings, 
rigid and excessive self-control, and authoritarian 
attitudes.

A few correlations will illustrate. The assessment
staff rating on “intellectual competence”
correlates -.63 with conformity score, this being
the highest relationship of any found. The Concept
Mastery Test,³ a measure of superior mental
functioning, correlates -.51 with conformity.
An “ego strength” scale, independently derived
by Barron (3), correlates -.33, and a staff rating
on “leadership ability,” -.30 with conformity.
Scales of Gough’s California Psychological Inventory
(6), pertaining to such dimensions as “tolerance,”
“social participation,” and “responsibility,”
range in correlation from -.30 to -.41
with conformity.

And as for some of the positive correlates, the 
F scale (7), a measure of authoritarian attitudes, 
correlates +.39 with conformity, and a staff 
rating on amount of authoritarian behavior mani- 
fested in a standard psychodrama situation corre- 
lates +.35 with conformity. 

The general appraisal of each man by the 
assessment staff in the form of descriptive Q sorts 
further enriches this picture. Those men ex- 
hibiting extreme independence in the situation as 
contrasted with those at the high conformity end 
are described more often in the following terms 
by the assessment staff, which was entirely igno- 
rant of the actual behavior of the men in the 
group pressure procedure: 


Is an effective leader.

Takes an ascendant role in his relations with
others.

Is persuasive; tends to win other people over 
to his point of view. 

Is turned to for advice and reassurance. 

Is efficient, capable, able to mobilize re- 
sources easily and effectively. 


3 Used with the kind permission of Dr. Lewis M. 
Terman. 
Is active and vigorous.

Is an expressive, ebullient person. 

Seeks and enjoys aesthetic and sensuous im- 
pressions. 

Is natural; free from pretense, unaffected.

Is self-reliant; independent in judgment; able 
to think for himself. 


In sharp contrast to this picture of the inde- 
pendent men is the following description of those 
high in conformity behavior: 


With respect to authority, is submissive, com- 
pliant and overly accepting. 

Is conforming; tends to do the things that are 
prescribed. 

Has a narrow range of interests. 

Overcontrols his impulses; is inhibited; need- 
lessly delays or denies gratification. 

Is unable to make decisions without vacilla- 
tion or delay. 

Becomes confused, disorganized, and unadap- 
tive under stress. 

Lacks insight into his own motives and be- 
havior. 

Is suggestible; overly responsive to other peo- 
ple’s evaluations rather than his own. 


Further evidence is found in some of the spe- 
cific items of personality inventories on which the 
answers of the high and low conformers are sig- 
nificantly different. Here are some illustrative 
items more frequently answered “True” by the 
independent subjects than by the conforming sub- 
jects: 


Sometimes I rather enjoy going against the 
rules and doing things I’m not supposed to.

I like to fool around with new ideas, even if 
they turn out later to be a total waste of time. 

A person needs to “show off” a little now 
and then.

At times I have been so entertained by the 
cleverness of a crook that I have hoped he 
would get by with it. 

It is unusual for me to express strong ap- 
proval or disapproval of the actions of others. 

I am often so annoyed when someone tries 
to get ahead of me in a line of people that I 
speak to him about it. 

Compared to your own self-respect, the re- 
spect of others means very little. 


This pattern of expressed attitudes seems to 
reflect freedom from compulsion about rules, ad- 
venturousness (perhaps tinged with exhibition- 
ism), self-assertiveness, and self-respect. 

Turning to the opposite side of the picture, here
are some illustrative items more frequently answered
“True” by the extreme conformists, which
reflect a rather rigid, externally sanctioned, and
inconsistent, moralistic attitude.


I am in favor of very strict enforcement of all 
laws, no matter what the consequences. 

It is all right to get around the law if you
don’t actually break it. 

Most people are honest chiefly through fear 
of being caught. 


Another set of items reveals a desire for clarity, 
symmetry, certainty, or, in presently popular 
phraseology, “an intolerance of ambiguity.” 


I don’t like to work on a problem unless there 
is a possibility of coming out with a clear-cut 
and unambiguous answer. 

Once I have made up my mind I seldom 
change it. 

Perfect balance is the essence of all good 
composition. 


Other items express conventionality of values: 


I always follow the rule: business before 
pleasure.

The trouble with many people is that they 
don’t take things seriously enough. 

I am very careful about my manner of dress. 


Anxiety is revealed in numerous items: 


I am afraid when I look down from a high 
place. 

I am often bothered by useless thoughts 
which keep running through my head.

I often think, “I wish I were a child again.” 

I often feel as though I have done something 
wrong or wicked. 


And, finally, there are various expressions of 
disturbed, dejected, and distrustful attitudes to- 
ward other people: 


When I meet a stranger I often think that he 
is better than I am. 

Sometimes I am sure that other people can 
tell what I am thinking. 

I wish that I could get over worrying about
things I have said that may have injured other 
people’s feelings. 

I commonly wonder what hidden reason an- 
other person may have for doing something 
nice for me. 

People pretend to care more about one an- 
other than they really do. 


Although there is an unmistakable neurotic 
tone to many of the foregoing statements, one 
must be chary of inferring that those high on con- 
formity are measurably more neurotic than the 
others. There does not in fact appear to be any 
significant correlation of the conformity scores 
with obvious standard measures of neuroticism as 
found, for instance, in scales of the Minnesota
Multiphasic Personality Inventory. A similar 
negative finding has been reported by Barron (4) 
in his study of the personality correlates of inde- 
pendence of judgment in Asch’s subjects. 

In another area, attitudes concerning parents 
and children, differences between those high and 
low on conformity are especially interesting. The 
extreme conformists describe their parents in 
highly idealized terms, unrelieved by any sem- 
blance of criticism. The independents, on the 
other hand, offer a more balanced picture of praise 
and criticism. 

Most of the men in the sample are fathers, and 
it is instructive to see that in their view of child- 
rearing practices, the conformers are distinctly 
more “restrictive” in their attitudes, and the in- 
dependents distinctively more “permissive” (5). 

Finally, there appears to be a marked difference 
in the early home background of the conformists 
and independents. The high conformers in this 
sample come almost without exception from stable 
homes; the independents much more frequently 
report broken homes and unstable home environments.

Previous theoretical and empirical studies seem 
to converge, though imperfectly, on a picture of 
the overconformist as having less ego strength, 
less ability to tolerate own impulses and to tolerate 
ambiguity, less ability to accept responsibility, less 
self-insight, less spontaneity and productive orig- 
inality, and as having more prejudiced and au- 
thoritarian attitudes, more idealization of parents, 
and greater emphasis on external and socially ap- 
proved values. 

All of these elements gain at least some sub- 
stantiation in the present study of conformity be- 
havior, as objectively measured in our test situa- 
tion. The decisive influence of intelligence in 
resisting conformity pressures is perhaps given 
even fuller weight in the present findings. 


CONFORMITY BEHAVIOR 
IN DIFFERENT POPULATIONS 


Two further studies have been made. The first 
was with 59 college undergraduates, mostly sopho- 
mores. Forty were females, 19 males. An addi- 
tional 40 students served as control subjects. 

Using the same procedures and the same items 
for judgment, the conformity results for this stu- 
dent sample were highly similar to those already 
reported for the adult men. Here again extensive
group pressure effects are found on almost all
items. And here again there are wide individual 
differences, covering virtually the entire score 
range. 

The male students on the average exhibit just 
about the same level of conformity as do the adult 
men. The female students, on the other hand, 
exhibit significantly higher amounts of conformity 
than the male groups. This greater conformity 
among females is evident across the entire range 
of items tested. Interpretation of this sex difference
in conformity will require further research.

But before male egos swell overly, let me hasten 
to report the results of a third study, just com- 
pleted. Fifty women, all college alumnae in their 
early forties, were tested in the same group pres- 
sure procedure, again as part of a larger assess- 
ment setting, and under the auspices of the Mary 
Conover Mellon Foundation.* As in the previous 
populations, virtually the entire range of individ- 
ual differences in conformity is exhibited by these 
women. Some of them show no effect at all;
others are influenced on almost all items. But 
the average conformity score for these 50 women 
is significantly lower than that found in the pre- 
vious populations. 

Thus we find our sample of adult women to be 
more independent in judgment than our adult 
men. The interpretation is difficult. The two 
groups differ in many particulars, other than sex. 
The women are highly selected for educational 
and socioeconomic status, are persons active in 
their community affairs, and would be character- 
ized as relatively stable in personality and free of 
psychopathology. The adult men in our profes- 
sional group are less advantageously selected in 
all these respects. Differences in intellectual level 
alone might be sufficient to account for the ob- 
served differences in conformity scores. 


PSYCHOLOGICAL PROCESSES 


Turn now to questions concerning the nature 
of the psychological processes involved in these 
expressions of conformity to group pressure. 
How, for instance, is the situation perceived by 
the individual? The most striking thing is that 
almost never do the individuals under this pressure 
of a false group consensus come to suspect the 
deception practiced upon them. Of the total of
159 persons already tested in the apparatus, and
questioned immediately afterwards, only a small
handful expressed doubt of the genuineness of the
situation. Of these not more than two or three
really seem to have developed this suspicion while
in the actual situation.

* The assessment was under the direction of Dr.
R. Nevitt Sanford.

Yet all the subjects are acutely aware of the 
sometimes gross discrepancies between their own 
inner judgments and those expressed by the rest 
of the group. How do they account for these 
discrepancies? 

Intensive individual questioning of the subjects 
immediately following the procedure elicits evi- 
dence of two quite different tendencies. First, 
for many persons the discrepancies tend to be 
resolved through self-blame. They express doubt 
of their own accuracy of perception or judgment, 
confessing that they had probably misread or mis- 
perceived the slides. Second, for many other per- 
sons the main tendency is to blame the rest of the 
group, expressing doubt that they had perceived 
or read the slides correctly. This is not a neat
dichotomy, of course. Most persons express 
something of a mixture of these explanations, 
which is not surprising in view of the fact that 
some slides may tend to favor one interpretation 
of the difficulty and other slides the opposite 
interpretation. 

As might be predicted, there is a substantial 
relationship between conformity score and tend- 
ency to self-blame; or, putting it the other way, 
those who remain relatively independent of the 
group pressure are more likely to blame the dis- 
crepancies on poor judgments by the rest of the 
group. 

But this is by no means a perfect relationship. 
There are many persons who, though retrospec- 
tively expressing doubt of the correctness of the 
group’s judgment, did in fact conform heavily 
while in the situation. And what is even more 
striking is that a substantial number of the sub- 
jects—between 25 and 30 per cent—freely admit 
on later questioning that there were times when 
they responded the way the group did even when 
they thought this not the proper answer. It seems 
evident, therefore, that along with various forms
of cognitive rationalization of the discrepancies,
there occurred a considerable amount of what
might be called deliberate conforming, that is,
choosing to express outward agreement with the
group consensus even when believing the group
to be wrong.

Another noteworthy effect was the sense of
increased psychological distance induced between 
the person himself and the rest of the group. He 
felt himself to be queer or different, or felt the 
group to be quite unlike what he had thought. 
With this went an arousal of considerable anxiety 
in most subjects; for some, manifest anxiety was 
acute. 

The existence of these tensions within and be- 
tween the subjects became dramatically manifest 
when, shortly after the end of the procedure, the 
experimenter confessed the deception he had prac- 
ticed and explained the real situation. There were 
obvious and audible signs of relaxation and relief, 
and a shift from an atmosphere of constraint to 
one of animated discussion. 

This is an appropriate point to comment on 
ethics. No persons when questioned after expla- 
nation of the deception expressed feelings that 
they had been ethically maltreated in the experi- 
ment. The most common reaction was a positive 
one of having engaged in an unusual and signifi- 
cant experience, together with much joking about 
having been taken in. 

Undeniably there are serious ethical issues in- 
volved in the experimental use of such deception 
techniques, especially inasmuch as they appear to 
penetrate rather deeply into the person. My view 
is that such deception methods ethically require 
that great care be taken immediately afterwards 
to explain the situation fully to the subject. 

These remarks on ethics of the method are 
especially pertinent as we move from study of 
judgmental materials which are noncontroversial 
to those which are controversial. In the studies 
of college students and of mature women, many 
new critical items were introduced and subjected 
to the pressure. They were intended to explore 
more deeply the conformity tendencies in matters 
of opinion and attitude. And they were so chosen 
as to pertain to socially important and contro- 
versial issues involving civil liberties, political phi- 
losophy, crime and punishment, ethical values, 
and the like. 

Here are two salient examples. An expression 
of agreement or disagreement was called for on 
the following statement: “Free speech being a 
privilege rather than a right, it is proper for a 
society to suspend free speech whenever it feels 
itself threatened.” Among control subjects, only 
19 per cent express agreement. But among the 
experimental subjects confronted with a unanimous
group consensus agreeing with the statement,
58 per cent express agreement.

Another item was phrased as follows: “Which 
one of the following do you feel is the most im- 
portant problem facing our country today?” And 
these five alternatives were offered: 


Economic recession 
Educational facilities 
Subversive activities 
Mental health
Crime and corruption


Among control subjects, only 12 per cent chose 
“Subversive activities” as the most important. 
But when exposed to a spurious group consensus 
which unanimously selected “Subversive activities” 
as the most important, 48 per cent of the experi- 
mental subjects expressed this same choice. 

I think that no one would wish to deny that 
here we have evidence of the operation of power- 
ful conformity influences in the expression of 
opinion on matters of critical social controversy. 


REINFORCEMENT OF CONFORMITY 


There is one final point upon which I should 
like to touch briefly. That is the question of 
whether there are circumstances under which the 
power of the group to influence the judgments of 
the individual may be even more greatly rein- 
forced, and if so, how far such power may extend. 

One method has been tried as part of the study 
of college students. With half of the subjects, a 
further instruction was introduced by the experi- 
menter. They were told that in order to see how 
well they were doing during the procedure, the 
experimenter would inform the group immediately 
after the judgments on each slide what the correct 
answer was. This was to be done, of course, only 
for those slides for which there was a correct 
answer, namely, perceptual judgments, logical 
solutions, vocabulary, etc. No announcement 
would be made after slides having to do with 
opinions and attitudes. 

The experimenter here again deceived the sub- 
jects, for the answers he announced as correct 
were deliberately chosen so as to agree with the 
false group consensus. In short, the external au- 
thority of the experimenter was later added on as 
reinforcement to the group consensus. 

The effect of this so-called “correction” method 
is striking. As the series of judgments goes on, 
these individuals express greater and greater conformity
to the group pressure on slides which are
of the same character as those for which earlier 
in the series the false group consensus was thus 
reinforced by the false announcement by the ex- 
perimenter. 

But the more critical issue is whether this en- 
hanced power of the group generalizes also to 
judgments of an entirely unrelated sort, namely, 
matters of opinion and attitude, rather than of 
fact. In other words, will the group, through 
having the rightness of its judgment supported by 
the experimenter on matters of perception, logic, 
and the like, thereby come to be regarded by the 
individual as more right, or more to be complied 
with, on entirely extraneous matters, such as social 
issues? 

The answer is absolutely clear. The enhanced 
power of the group does not carry over to increase 
the effective influence on expression of opinions 
and attitudes. The subjects exposed to this “cor- 
rection” method do not exhibit greater conformity 
to group pressure on opinions and attitudes than 
that found in other subjects.

This crucial finding throws some light on the 
nature of the psychological processes involved in 
the conformity situation. For it seems to imply 
that conformity behavior under such group pres- 
sure, rather than being sheerly an indiscriminate 
and irrational tendency to defer to the authority 
of the group, has in it important rational elements. 
There is something of a reasonable differentiation 
made by the individual in his manner of reliance 
upon the group. He may be led to accept the 
superiority of the group judgment on matters 
where there is an objective frame of reference 
against which the group can be checked. But he 
does not, thereby, automatically accept the author- 
ity of the group on matters of a less objective sort. 


CONCLUSION 


The social psychologist is concerned with the 
character of conformity, the personologist with 
conformity of character. Between them they raise 
many research questions: the comparative inci- 
dence of conformity tendencies in various popu- 
lations; the influence of group structure and the 
individual's role in the group on the nature and 
amount of conformity behavior; the effects of re- 
ward or punishment for conforming on habits of 
conformity; the genesis and change of conformity 
behavior in the individual personality; the determinants
of extreme anticonformity tendencies.

Contributing to such questions we have what
appears to be a powerful new research technique, 
enabling the study of conformity behavior within 
a setting which effectively simulates genuine group 
interaction, yet preserves the essential require- 
ments of objective measurement. 


SECTION VI


Perception, the Self, and Personality


The study of perception has constituted a signifi- 
cant area of psychological study for many years. 
Recently, an increasing number of psychologists 
have been interested in its relationship to the phe- 
nomena studied within the field of personality. 
This has come about in response to the necessity 
of empirically examining many concepts often re- 
lating to unconscious motivation which, when sub- 
jected to systematic analysis, turn out to involve 
assumptions concerning perceptual events and 
processes. These assumptions frequently are im- 
plicit in such questions as: How do unconscious 
forces influence behavior? Can weak, “impercep- 
tible,” stimuli affect performance? What is a 
given individual's self percept? Why do we re- 
spond in a certain way to some persons and not 
others? 

The paper by McConnell, Cutler, and McNeil, 
which begins this section, provides an overview 
of the theoretical and methodological problems 
involved in the assumptions that (1) weak stimuli 
can be unconsciously perceived, and (2) they
can exert strong influences over behavior. These 
authors point out the need in evaluating the valid- 
ity of these assumptions for empirical evidence 
concerning the roles of personality and perceptual 
individual differences in the study of response to 
subliminal stimuli. An important point made by 
McConnell, Cutler, and McNeil is that just be- 
cause a concept (e.g., subliminal perception) is 
challenging, interesting and plausible is no reason 
to assume that it is necessarily correct. Instead,
such a concept should be taken as a starting point
for scientific study rather than an end product
in itself.

Just as the psychologist may speak of the per- 
ception of an external object, so it may also be 
possible to refer to the perception of oneself. 
This possibility has stimulated much thought on 
the topic of the self-concept. This concept has 
been defined in many different ways. For exam- 
ple, some writers emphasize the stimulus value for 
the individual of his perception of himself. 
Others have been more concerned with the self 
conceived as the integrator of the individual's ex- 
perience. A problem confronting the study of 
the self-concept is that of the way in which information
concerning the self is obtained. All too
often information about the self is obtained by 
means of introspective reports from individuals. 
This sort of data, however, can be highly unreli- 
able. One alternative to an introspective ap- 
proach to the self-concept has been provided by 
Hilgard’s proposals dealing with the way in which 
the self-concept can be inferred from objective 
procedures and observations. Hilgard’s paper is 
an important one on two counts. The first is the 
way in which he treats the self as an inference 
from behavior, and the second is his effort to 
integrate the inferred self with aspects of psycho- 
analytic theory. 

If attitude towards one’s self is in some way 
incorporated into the self-concept, and if these 
self attitudes significantly influence behavior, then
one reasonable possibility would be to investigate 
to what extent these attitudes interact with atti- 
tudes of others in determining social behavior. 
Newcomb's article, in Section Five, has presented 
an examination of the relationship of perceived 
similarity between individuals to the degree of 
interpersonal attraction between them. Many in- 
vestigators have considered the phenomenon of 
interpersonal attraction in terms of such concepts 
as empathy, social perception, and social sensi- 
tivity. Frequently these concepts carry the pos- 
sible implication that such attributes are trait- 
like in nature. In their article, Gage and Cron- 
bach demonstrate the complexity of the phenom- 
ena of interpersonal perception and prediction and 
make a good case for the argument that they may 
very well not be general personality traits. Their 
article clearly shows the value of careful method- 
ology and analysis of the experimental situation 
in evaluating theoretical assumptions. 


SUBLIMINAL STIMULATION:
AN OVERVIEW * 


JAMES V. MCCONNELL, 
RICHARD L. CUTLER, 
AND ELTON B. MCNEIL 


Seldom has anything in psychology caused such 
an immediate and widespread stir as the recent 
claim that the presentation of certain stimuli be- 
low the level of conscious awareness can influence 
people’s behavior in a significant way. The con- 
troversy was precipitated primarily by a commer- 
cial firm which claimed that the subliminal pres- 
entation of the words “Eat Popcorn” and “Drink 
Coca-Cola” fantastically stimulated the respective 
sales of these products among the motion picture 
audiences who received the stimulation. Despite 
the fact that detailed reports of the experiment 
have not been made directly available in any 
published form, this technique was seized upon 
as the newest of the “new look” promises of the 
application of psychology to advertising. While 
such claims and demonstrations will be considered 
in greater detail below, it is important to note here 
that they have given rise to a series of charges 
and countercharges, the effects of which have 
reached the United States Congress and the Fed- 
eral Communications Commission (7, 117). 

Rarely does a day pass without a statement in 
the public press relating to the Utopian promise 
or the 1984 threat of the technique (8, 17, 29, 37, 
42, 45, 118, 132). Since the process of choosing 
up sides promises to continue unabated, it appears 
wise to provide the potential combatants with a 
more factual basis for arriving at their positions 
than presently seems available. Meanwhile, the 
present writers have cautiously sought to avoid 
aligning themselves behind either of the barri- 
cades. 

Obviously, the notion that one may influence 
the behavior of another individual without the 
individual’s knowing about it is a fascinating one. 
It is of extreme interest, not only to psychologists 
and advertisers, but also to politicians, psychia- 
trists, passionate young men, and others, whose 
motives would be considered more or less sacred 


* Reprinted by permission from The American Psy- 
chologist, May, 1958, Vol. 13, No. 5, 229-242. 
by the larger society. Equally obvious is the need 
for a clarification of the issues surrounding the 
application of subliminal perception. This clari- 
fication must involve the assessment of available 
scientific evidence, the answering of a series of 
technical questions, and the examination of what, 
if any, levels of behavior may indeed be influ- 
enced. Finally, a series of extremely complex 
ethical issues needs to be explored. It is the pur- 
pose of the present paper to undertake this task, 
in the hope of providing information upon which 
possible decisions involving its application may 
be based. 


RECENT HISTORY OF THE TECHNIQUE 


The custom of providing a chronological review 
of the literature will be violated in this paper, 
inasmuch as three separate threads of investiga- 
tion seem worth tracing: (a) the recent dem- 
onstrations by advertisers which first aroused 
large-scale public interest in subliminal percep- 
tion, (b) systematic research by psychologists re- 
lating directly to the influencing of behavior with- 
out the individual’s awareness that he is being 
influenced, and (c) psychological research con- 
cerned primarily with the influence of inner states 
of the organism upon the threshold for conscious 
recognition of certain stimuli. 

Recent Advertising Demonstrations.—While the 
advertising possibilities of subliminal stimulation 
were recognized by Hollingworth (59) as early 
as 1913, the intensive work in its application to 
this area has been carried out within the past 
two years. In 1956, BBC-TV, in conjunction with 
one of its regular broadcasts, transmitted the mes- 
sage “Pirie Breaks World Record” at a speed 
assumed to be subliminal (85). At the conclu- 
sion of the regular program, viewers were asked 
to report whether they had noticed “anything un- 
usual” about the program. While no reliable 
statistical data are available, it seems possible that 
those few viewers responding to the message pos- 
sessed sufficiently low thresholds so that for them 
the message was supraliminal. 

A demonstration by the commercial enterprise 
which has been most vocal in its claims for the 
advertising promise of the technique consisted of 
projecting, during alternate periods, the words 
“Eat Popcorn” and “Drink Coca-Cola” during 
the regular presentation of a motion picture pro- 
gram. As a result of this stimulation, reports 
contend,¹ popcorn sales rose more than 50% and 
Coca-Cola sales 18%, as compared to a “previous 
period.” Despite the likelihood of serious meth- 
odological and technical defects (exposure time 
was reported as 1/3,000 sec., far faster than any 
previously reported stimulation), this demonstra- 
tion has been the one which has caused the most 
stir in both the fields of advertising and psychol- 
ogy. There were no reports, however, of even the 
most rudimentary scientific precautions, such as 
adequate controls, provision for replication, etc., 
which leaves the skeptical scientist in a poor posi- 
tion to make any judgment about the validity of 
the study. 

In a later demonstration for the press, technical 
difficulties permitted the viewers to become con- 
sciously aware of the fact that they were being 
stimulated. Although described as a purposeful 
and prearranged part of the demonstration, it left 
many of the reporters present unconvinced that 
the technical difficulties inherent in the technique 
have been surmounted. 

The FCC, turning its attention to the problem, 
has reported that one TV station (WTWO, 
Bangor, Maine) has experimented with the trans- 
mission of public service announcements at sub- 
liminal levels, with “negative results” (117). 

The uncontrolled and unsystematic nature of 
the demonstrations reported above makes very 
difficult the task of reaching a trustworthy con- 
clusion about the effectiveness of subliminal stim- 
ulation in advertising. Whether the technique 
represents a promising means of communicating 
with the individual at a level of his unconscious- 
ness or whether it reflects only the hyperenthusi- 
asm of an entrepreneurial group remain unan- 
swered questions. 

Research on Behavior Without Awareness.—In 
the hope of providing a more substantial founda- 
tion upon which to base judgments of the validity 
of advertising claims for subliminal stimulation, 
a systematic review of relevant scientific work was 
undertaken. While we believe that our review 
was comprehensive, we have decided not to pro- 
vide an extensive critical discussion of the various 
studies, choosing instead to present summative 


¹ The essential facts of this study have not been 
reported in any journal. The discussion of this ex- 
periment and the findings reported by the commercial 
enterprise responsible for the study is based on reports 
in several general news accounts appearing in the 
popular press (7, 8, 16, 17, etc.). 




statements and conclusions based upon what seems 
to be sufficient evidence and consensus in the 
literature.² 

The work of experimental psychologists in 
subliminal stimulation dates from Suslowa (119) in 
1863, as reported by Baker (5). Suslowa’s ex- 
periments concerned the effect of electrical stimu- 
lation upon subjects’ ability to make two-point 
threshold discriminations. He found that, even 
when the intensity of the electrical stimulation 
was so low that the subjects were not aware of its 
presence, their ability to discriminate one- from 
two-point stimulation was somewhat reduced. 

In 1884, Peirce and Jastrow (94) were able to 
show that subjects could discriminate differences 
between weights significantly better than chance 
would allow, even though the differences were so 
small they had no confidence whatsoever in their 
judgments. 
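
The phrase "significantly better than chance" can be made concrete with a binomial calculation. The sketch below is only an illustration: the trial counts are hypothetical and are not taken from Peirce and Jastrow's report, and the function name `p_at_least` is our own. It computes how likely a subject who is purely guessing between two alternatives would be to score at least a given number of correct judgments.

```python
from math import comb


def p_at_least(successes, trials, chance=0.5):
    """Probability of `successes` or more correct guesses in `trials`
    independent judgments when each guess is correct with probability `chance`."""
    return sum(
        comb(trials, k) * chance**k * (1.0 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical example: 60 correct weight judgments out of 100,
# made with "zero confidence" by the subject.
p = p_at_least(60, 100)
print(f"Probability of 60 or more correct by guessing alone: {p:.4f}")
```

A small probability here would indicate that the judgments carry information about the stimuli even though the subject reports no confidence in them. It says nothing, by itself, about how large or how practically useful the effect is.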

Numerous experimenters have relied upon this 
criterion of “zero confidence” to establish that 
discrimination of stimuli presented below the 
level of conscious awareness is possible. For 
example, Sidis (107) showed that subjects could 
reliably distinguish letters from numbers, even 
when the stimuli were presented at such a dis- 
tance from them that the subjects thought they 
were relying on pure guesswork for their judg- 
ments. 

In what was essentially a replication of Sidis’ 
research, Stroh, Shaw, and Washburn (116) 
found evidence to support his conclusions. They 
found similar results when auditory stimuli 
(whispers) were presented at a distance such 
that the subjects were not consciously aware 
that they were hearing anything. 

Several experiments have provided further sup- 
port for Peirce and Jastrow’s initial conclusions 
(44, 127). Baker (5) found subjects able to 
discriminate diagonal from vertical crossed lines, 
and a dot-dash from a dash-dot auditory pattern. 
Miller (88) presented five geometric figures at 
four different levels of intensity below the thresh- 
old and found that, while subjects could discrim- 
inate which was being presented a significant pro- 
portion of the time, their ability to discriminate 
was reduced as the intensity of stimulation was 
further reduced. More recently, a series of 


² The reader who wishes a more complete technical 
critique of studies in the field is referred to reviews 
by Adams (1), Collier (27), Coover (28), Lazarus 
and McCleary (76), and Miller (90). 
studies by Blackwell (11) has shown that 
subjects can reliably identify during which of four 
time periods a subliminal spot of light is pre- 
sented upon a homogeneous field. Blackwell, 
however, stresses that reliability of discrimination 
decreases as the intensity of the stimulus is further 
lowered. Several other supporting studies are 
available (28, 97, 130) which show essentially 
the same results, namely, that even when sub- 
jects have zero confidence in their judgments, 
they can discriminate reliably (though not per- 
fectly) between stimuli. 

In his review, Adams (1) points out certain 
general weaknesses inherent in studies of this 
type, but agrees with the present authors that 
discrimination can occur under certain 
circumstances. However, it is interesting to note that, 
in nearly all studies reporting relevant data, the 
reliability of the subjects’ judgments increases 
directly with the intensity of the stimuli. If a 
valid extrapolation can be drawn from this find- 
ing, it would be that accuracy of perception in- 
creases as the stimulation approaches a supra- 
liminal level. 

A second series of studies has involved present- 
ing subjects with variations of the Mueller-Lyer 
illusion, in which the angular lines have differed, 
subliminally, in hue or brightness from the back- 
ground. The first of these studies, reported by 
Dunlap in 1909 (36), gave clear evidence that 
the subjects were influenced in their judgments 
of line length, even though they could not “see” 
the angular lines. Several replications of this 
study have been carried out, and while at least 
three have found partial support for Dunlap’s 
conclusions (14, 59, 86), others have failed to 
find the phenomenon (123). In another experi- 
ment conducted by Sidis in 1898 (107), subjects 
asked to fixate on a number series in the center 
of a card, and then asked to pick a number from 
this series, systematically chose that number 
which was written in the periphery of the card, 
even though they were not consciously aware of 
its presence. Coover (28) in 1917 showed es- 
sentially the same results by asking subjects to 
pick a number at random while they were fixat- 
ing on a letter in the upper right portion of a 
card. He found that subjects tended to pick the 
number printed in the lower left of the card, even 
though they did not usually know it was there. 
In similar experiments, Collier (27) and Perky 
(95) showed that subjects could be made to 
produce drawings, even though they were not aware 
that they were being influenced in their actions. 
While these studies are not unequivocal in their 
findings, nor generally rigorous in their method- 
ology, they too seem to support the contention 
that behavior of a sort can be influenced by 
subliminal means. However, they require cau- 
tious interpretation, since the degree of the sub- 
ject’s attention to the stimuli seems clearly to be 
a factor. Further, as contrasted to those studies 
where the subject is actually aware in advance of 
at least the general nature of the stimulation, 
these studies reveal a somewhat less pronounced 
effect of subliminal stimulation upon the sub- 
ject’s behavior. 

While the studies reported above seem to indi- 
cate that discrimination without awareness may 
occur, it may reasonably be asked whether stim- 
ulation below the level of conscious awareness 
can produce any but the most simple modifica- 
tions in behavior. A series of studies (24, 26, 
73, 109), beginning with Newhall and Sears in 
1933 (92), have attempted to show that it is 
possible to condition subjects to subliminal 
stimuli. Newhall and Sears found it possible 
to establish a weak and unstable conditioned 
response to light presented subliminally, when the 
light had been previously paired with shock. 
Baker (6) in 1938 reported the successful 
conditioning of the pupillary reflex to a subliminal 
auditory stimulus, but later experimenters have 
failed to replicate his results (57, 128). In a 
now classic experiment, McCleary and Lazarus 
(79) found that nonsense syllables which had 
previously been associated with shock produced 
a greater psychogalvanic reflex when presented 
tachistoscopically at subliminal speeds than did 
nonshock syllables. Deiter (34) confirmed the 
McCleary and Lazarus findings and showed fur- 
ther that, when verbal instructions were substi- 
tuted for the shock, no such differences were 
produced. Bach and Klein (4) have recently 
reported that they were able to influence sub- 
jects’ judgments of whether the line drawing of 
a face (essentially neutral in its emotional ex- 
pression) was “angry” or “happy” by projecting 
the appropriate words at subliminal speeds upon 
the drawing. 

A series of related studies (58, 65, 89, 99, 105, 
121, 122) have shown that, even when the 
subject is not aware that any cue is being given, 
certain responses can be learned or strengthened 
during the experimental process. For example, 
Cohen, Kalish, Thurston, and Cohen (25) showed 
that, when the experimenter said “right” to any 
sentence which the subject started with “I” or 
“We,” the number of such sentences increased 
significantly. Klein (69) was able to produce 
both conditioning and extinction without 
awareness, using the Cohen et al. technique. 

Several experimenters have used subliminal or 
“unnoticed” reward-punishment techniques to 
modify subjects’ responses in a variety of situa- 
tions, including free or chained association tasks, 
performance on personality tests, and interview 
elicited conversation (35, 41, 50, 56, 72, 78, 93, 
120, 125, 126). Typical is the work of Green- 
spoon (48), who reinforced the use of plural 
nouns by saying “mm-humm” after each plural 
mentioned by the subject. He found that, even 
though none of his subjects could verbalize the 
relationship between their response and his rein- 
forcement, their use of plural nouns doubled. 
Sidowski (108) demonstrated essentially the same 
thing using a light, of which the subject was 
only peripherally aware, as a reinforcer for the 
use of plural words. Weiss (129), however, 
failed to find any increase in the frequency of 
“living things” responses, using a right-wrong re- 
inforcement to free associations by the subjects. 

This evidence suggests that subjects may either 
(a) “learn” certain subliminally presented stimuli 
or (b) make use of subliminal reinforcers either 
to learn or strengthen a previously learned 
response. Again, the critical observations of Adams 
(1) and the introduction of other possible ex- 
planations by Bricker and Chapanis (15) make 
necessary a cautious interpretation of these re- 
sults. 

Effects of Inner States Upon Thresholds.— 
Whatever the possibility that subliminal stimula- 
tion may significantly alter behavior, there is ex- 
cellent evidence that certain inner states of the 
organism, as well as externally induced condi- 
tions, may significantly alter the recognition 
threshold of the individual. This, of course, has 
important implications for the susceptibility of 
the individual to the effects of subliminal stimula- 
tion. It is well known that physiological factors, 
such as fatigue, visual acuity, or satiation, may 
change the threshold of an individual for various 
kinds of stimuli. 

Recent evidence has accumulated to show that, 
in addition to these physiological factors, certain 
“psychological states,” such as psychological need, 
value, conflict, and defense, may also significantly 
influence thresholds, as well as other aspects of 
the perceptual process. Early work in this area 
is reported by Sanford (102, 103) who showed 
that subjects who had been deprived of food were 
more prone to produce “food-relevant” responses 
to a series of ambiguous stimuli. McClelland 
and Atkinson (80) showed that levels of the 
hunger drive were systematically related to the 
ease with which food responses were made when 
no words were presented on the screen. 

While a complete review of the experimental 
work on “perceptual defense” and “selective 
vigilance” would take us too far afield, it seems 
wise to indicate, by example, some of the inner 
state factors which allegedly produce variations 
in recognition threshold. Bruner and Postman 
(19, 20, 21) and Bruner and Goodman (18) were 
able to show that such factors as symbolic value, 
need, tension and tension release, and emotional 
selectivity were important in the perceptual proc- 
ess. Ansbacher (3) had earlier demonstrated 
that the perception of numerosity was signifi- 
cantly affected by the monetary value of the 
stimuli. Rees and Israel (101) called attention 
to the fact that the mental set of the organism 
was an important factor in the perceptual process. 
Beams and Thompson (9) showed that emo- 
tional factors were important determiners of the 
perception of the magnitude of need-relevant ob- 
jects. Other studies bearing upon the issue of 
inner state determiners of perception are reported 
by Carter and Schooler (23), Cowen and Beier 
(31, 32), and Levine, Chein, and Murphy (77). 

More specifically related to the issue of altered 
recognition thresholds is a study by McGinnies 
(82) in which he demonstrated that emotionally 
toned words had generally higher thresholds than 
neutral words. Blum (13) has shown that 
subjects tend to be less likely to choose conflict-relevant 
stimuli from a group presented at 
subliminal speeds than to choose neutral stimuli. 
Lazarus, Ericksen, and Fonda (75) have shown 
that personality factors are at least in part de- 
terminers of the recognition threshold for classes 
of auditory stimuli. Reece (100) showed that 
the association of shock with certain stimuli had 
the effect of raising the recognition threshold for 
those stimuli. 

While many writers have contended that the 
variations in threshold can be accounted for more 
parsimoniously than by introducing “motivational” 
factors such as need and value (60, 61, 
111), and while the issue of the degree to which 
need states influence perception is still unresolved 
(22, 39, 40, 62, 74, 83), it is apparent that the 
recognition threshold is not a simple matter of 
intensity nor speed of presentation. Recent work 
by Postman and others (47, 96, 98), which has 
sought to illuminate the prerecognition processes 
operating to produce the apparent changes in 
threshold, does not alter the fact that individual 
differences in the perceptual process must be 
taken into account in any further work on the 
effects of subliminal stimulation. 


UNANSWERED METHODOLOGICAL QUESTIONS 


Having now concluded that, under certain con- 
ditions, the phenomenon of subliminal perception 
does occur, we turn our attention next to the 
many unanswered questions which this conclu- 
sion raises. For example, what kinds of behavior 
can be influenced by subliminal stimulation? 
What types of stimuli operate best at subthresh- 
old intensities? Do all subliminal stimuli operate 
at the same “level of unconsciousness,” or do 
different stimuli (or modes of stimulation) af- 
fect different levels of unconsciousness? What 
characteristics of the perceiver help determine 
the effectiveness of subliminal stimulation? All 
of these questions, as well as many others of a 
technological nature, will be discussed in the 
ensuing paragraphs. 

A few words of caution concerning the word 
“subliminal” seem in order, however. It must be 
remembered that the psychological limen is a 
statistical concept, a fact overlooked by far too 
many current textbook writers. The common 
definition of the limen is “that stimulus value 
which gives a response exactly half the time” 
(44, p. 111). One of the difficulties involved in 
analyzing the many studies on subliminal per- 
ception is the fact that many experimenters have 
assumed that, because the stimuli which they 
employed were below the statistical limen for a 
given subject, the stimuli were therefore never 
consciously perceivable by the subject. This is, 
of course, not true. Stimuli slightly below the 
statistical limen might well be consciously per- 
ceivable as much as 49% of the time. Not only 
this, but thresholds vary from moment to mo- 
ment, as well as from day to day. All this is not 
to deny that stimuli which are so weak that they 
are never consciously reportable under any cir- 
cumstances may not indeed influence behavior. 
We simply wish to make the point that the range 
of stimulus intensities which are in fact “sub- 
liminal” may be smaller than many experimenters 
in the past have assumed. It has been commonly 
assumed that the several methods of producing 
subliminal stimuli, i.e., reducing intensity, dura- 
tion, size, or clarity, are logically and methodo- 
logically equivalent. While this may be true, it 
remains to be demonstrated conclusively. 
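
The statistical character of the limen can be illustrated with a short numerical sketch. It assumes, purely for illustration, that the probability of a conscious report follows a cumulative-normal psychometric function. The limen of 10 units, the spread of 2 units, and the function name `detection_probability` are all hypothetical choices and are not drawn from any study reviewed here.

```python
import math


def detection_probability(intensity, limen, spread):
    """Cumulative-normal psychometric function.

    Returns the assumed probability that a stimulus of the given intensity
    is consciously reported. By construction the value is exactly 0.5
    when intensity equals the limen.
    """
    z = (intensity - limen) / spread
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


limen = 10.0   # hypothetical 50% point, arbitrary intensity units
spread = 2.0   # hypothetical moment-to-moment variability of the threshold

for intensity in (10.0, 9.9, 9.0, 6.0):
    p = detection_probability(intensity, limen, spread)
    print(f"intensity {intensity:4.1f}: reported on about {100 * p:.1f}% of trials")
```

Under these assumptions a stimulus only slightly below the limen is still reported on nearly half of the trials, and only stimuli several spread units below it approach the "never consciously reportable" case. The particular curve shape is an assumption; the argument in the text requires only that the probability of report rise gradually with intensity.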

Types of Behavior Influenced by Subliminal 
Stimulation.—One of the first questions that 
springs to mind concerns the types of response 
which can be elicited with subliminal stimulation. 
Let us assume for the moment that the below- 
threshold advertisements used in commercial dem- 
onstrations were the sole cause of increased pop- 
corn buying among the movie audiences subjected 
to the ads. How did this come about? Did the 
stimulus “Eat Popcorn” elicit an already estab- 
lished response in some members of the audience? 
Or did the frequent repetitions of the stimulus 
message cause a shift in attitude towards popcorn 
eating which eventually resulted in the purchase 
of popcorn at the first opportunity the audience 
had? Did the ads merely raise an already existing, 
presumably learned, but weak need for popcorn 
to an above-the-action-threshold level, or 
did the ads actually create a need for popcorn 
where no need had existed beforehand? Did 
members of the audience rise like automatons 
during the course of the movie and thus miss part 
of the feature in order to satisfy a sudden craving 
for popcorn or in order to respond to a suddenly 
evoked stimulus-response connection? Or did 
they wait until a “rest period” to do their purchas- 
ing? How many patrons bought popcorn only 
after they had seen the film and were heading 
home? How many people purchased popcorn on 
their way in to see the next movie they attended? 
How many of those who purchased popcorn did 
so for the first time in their lives, or for the first 
time in recent memory? What if the message 
presented had been “Buy Christmas Seals,” which 
are available only in one season? How many 
people failed to buy popcorn at the theater, but 
purchased it subsequently at the local super- 
market? 

Unfortunately, these pertinent questions have 
yet to be answered. Let us tentatively accept this 
demonstration that impulse buying of inexpensive 
items such as popcorn and Coca-Cola can be in- 
fluenced by subliminal advertising, without yet 
knowing what the mechanism involved is. It re- 
mains to be demonstrated, however, that such ads 
could make a person of limited means wreck him- 
self financially by purchasing a Cadillac merely 
because the ads told him to do so. Nor do we 
know if deep-seated, strongly emotional attitudes 
or long established behavior patterns can be shifted 
one way or another as a result of subliminal stim- 
ulation. The answers to these questions must 
come from further experimentation. 

As we have already seen, people can make use 
of subthreshold stimuli in making difficult per- 
ceptual judgments in situations where they are re- 
quired to call up images of various objects (95) 
and in situations where they are asked to “read 
the experimenter’s mind” (88). Kennedy (68) 
believes that some extrasensory-perception (ESP) 
experimenters may have obtained positive results 
because the “senders” unconsciously transmitted 
slight auditory and visual cues to their “receivers,” 
and offers many experimental findings to back up 
his belief. Kennedy’s studies also point up the 
difficult dilemma faced by people who object to 
subliminal stimulation as being an immoral or 
illegal attempt to influence other people. All of 
us, apparently, are constantly attempting to influ- 
ence the people around us by means of sounds 
and movements we are unconscious of making. 
Correspondingly, all of us make some unconscious 
use of the cues presented to us by the people 
around us. 

It also seems fairly clear that learning can take 
place when the stimuli to which the organism must 
respond are presented subliminally. Hankin (51) 
learned to predict changes in the flight of birds 
by utilizing wing-tip adjustments which were too 
slight to be consciously (reportably) noticeable. 
As we stated previously, Baker (6) obtained a 
conditioned pupillary response to subliminal audi- 
tory stimuli, although other investigators failed to 
replicate his findings. Miller (89) had subjects 
look at a mirror while trying to guess geometrical 
forms in an ESP-type experiment. Stimuli far 
below the statistical limen were projected on a 
mirror from behind. When the subjects were 
rewarded by praise for correct guesses and pun- 
ished by electric shock for wrong guesses, learn- 
ing took place. It is interesting to note that 
neither punishment alone nor reward alone was 
sufficient to produce learning. 

Whether different types of learning than those 
reported above can take place using subliminal 
stimulation, and indeed how broad a range of 
human behavior can be influenced in any way 
whatsoever by subliminal stimulation, are ques- 
tions which remain unanswered. 

Levels of Unconsciousness Affected by Subliminal 
Stimulation.³—We must now differentiate 
between stimuli which a subject cannot bring to 
awareness under any conditions (completely sub- 
liminal stimuli) and those stimuli of which he is 
merely not aware at the moment but could be 
made aware of should his set be changed. At any 
given moment, a vast conflux of stimuli impinges 
upon a subject’s receptors. Few of the sensations 
arising from this stimulation ever enter the focus 
of attention. As Dallenbach was fond of 
reminding his Freshman classes: “Until I mentioned it, 
you were quite unaware that your shoes are full 
of feet.” A great many experimenters have dem- 
onstrated that subjects could make use of stimuli 
well above the threshold of awareness but which 
could not be consciously reported on. Thus in 
one phase of her experiment, Perky (95) raised 
the intensity of the visual stimuli she was using 
to such a level that other psychologists who had 
not participated in the study apparently refused 
to believe that the subjects had not been aware 
of the stimuli. Perky’s subjects, however, operat- 
ing under a set to call up “images” of the stimuli 
presented, did not notice even relatively intense 
stimuli. Correspondingly, Newhall and Dodge 
(91) presented visual stimuli first at below-thresh- 
old intensities, then increased the intensities so 
slowly that the subjects were not aware of them 
even when the stimuli were well above threshold. 
When the stimuli were turned off suddenly, how- 
ever, the subjects experienced afterimages. Thus 
certain stimuli may be well above threshold and 
yet be “subliminal” in the sense that they cannot 
be reported on under certain experimental con- 
ditions. 

There are other levels of “unconsciousness” 


³ For an excellent review of the many meanings of 
the word “unconsciousness,” readers are referred to 
Miller’s book of the same name (90). 
which are deserving of our attention, however. 
Much work has been done at the animal level in 
which conditioning has been attempted upon 
animals with various parts of the brain removed (33, 
43). The same is true of animals under various 
types of anesthesia (106, 115). Miller, in 
summarizing the experimental data dealing with 
conditioning and consciousness, concludes: 

(a) That conditioning can take place in other 
parts of the nervous system than the cortex—even 
in the spinal cord; 

(b) That, if conditioned responses are evi- 
dences of consciousness, then consciousness is not 
mediated solely by the cortex; 

(c) That it may be possible to develop condi- 
tioning . . . at more than one level of the nerv- 
ous system at the same time; 

(d) And that . . . animals are conditionable 
even when anesthetized (90, p. 100). 

The nervous system has many levels of ana- 
tomical integration. Should we be surprised to 
discover that incoming stimuli may have an effect 
on a lower level and not on a higher and that 
under certain conditions this effect can later be 
demonstrated in terms of behavioral changes? 
We shall not be able to speak clearly of the effects 
of subliminal stimulation upon the various “levels 
of unconsciousness” until we have some better 
method of specifying exactly what these levels are 
and by what parts of the nervous system they are 
mediated. Experimentation is badly needed in 
this area. 

Technological Problems Involved in Stimulating 
Subjects Subliminally.—The paucity of data 
presented by those dealing with subliminal perception 
on a commercial basis, as well as the equivocal 
nature of their results, suggests that there are 
many technological problems yet to be solved by 
these and other investigators. For example, dur- 
ing a two-hour movie (or a one-hour television 
show), how many times should the stimulus be 
repeated to make sure that the “message” gets 
across to the largest possible percentage of the 
audience? Should the stimulus be repeated every 
second, every five seconds, only once a minute? 
Is the effect cumulative, or is one presentation 
really enough? Is there a satiation effect, such 
that the audience becomes “unconsciously tired” 
of the stimulation, and “unconsciously blocks” 
the incoming subliminal sensations? Should the 
stimuli be presented “between frames” of the 
movie (that is, when the shutter of the film 
projector is closed and the screen momentarily blank 
as it is 24 times each second), or should the mes- 
sage be presented only when the screen already 
has a picture on it? How close to the threshold 
(statistical or otherwise) should the stimuli be? 
How many words long can the message be? If 
the message must be short, could successive stimu- 
lations present sequential parts of a longer adver- 
tisement? How much of the screen should the 
stimuli fill? Should the stimuli be presented only 
during “happier” moments in the film, in order 
to gain positive affect? Does any affect transfer 
at all from the film to the ad? Should one use 
pictures, or are words best? Must the words be 
familiar ones? And what about subliminal audi- 
tory, cutaneous, and olfactory stimulation? 

As we have stated before, there has been so 
much talk and so little experimentation, and much 
of what experimentation has been done is so in- 
adequately reported, that we can merely hazard 
guesses based on related but perhaps not always 
applicable studies. 

To begin with, we can state with some assur- 
ance that, the closer to the threshold of awareness 
the stimuli are, the more effect they are likely 
to have. Study after study has reported increased 
effectiveness with increased intensity of stimula- 
tion (5, 14, 88, 97, 104). The main difficulty 
seems to be that thresholds vary so much from 
subject to subject (112), and from day to day 
(114), that what is subliminal but effective for 
one person is likely to be subliminal but ineffec- 
tive for a second, and supraliminal for a third. 
As is generally the case, anyone who wishes to 
use the technique of subliminal stimulation must 
first experiment upon the specific group of people 
whom he wishes to influence before he can decide 
what intensity levels will be most efficacious. 
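
The difficulty can be made concrete with a small sketch. It assumes, purely for illustration, three viewers with different absolute thresholds and a band just below each threshold within which a stimulus is taken to be effective; the threshold values, the band width, and the function name are hypothetical and come from none of the studies cited.

```python
# Illustrative sketch only: the threshold values and band width below are
# hypothetical, not data from any study cited in the text.

def classify(intensity, threshold, effective_band=0.2):
    """Label one stimulus intensity relative to one viewer's threshold.

    effective_band is an assumed fraction of the threshold within which a
    below-threshold stimulus is still taken to influence behavior.
    """
    if intensity >= threshold:
        return "supraliminal"
    if intensity >= threshold * (1 - effective_band):
        return "subliminal, effective"
    return "subliminal, ineffective"


# Three hypothetical viewers with different absolute thresholds.
thresholds = {"viewer A": 1.0, "viewer B": 1.5, "viewer C": 0.8}
intensity = 0.9  # one fixed projector setting for the whole audience

labels = {name: classify(intensity, t) for name, t in thresholds.items()}
for name, label in labels.items():
    print(name, "->", label)
```

Under these assumed values a single intensity falls into a different category for each viewer, which is why the intensity would have to be calibrated on the specific audience.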

Somewhat the same conclusion holds for the 
question of how many times the stimuli should be 
presented. While under some conditions sub- 
liminal stimuli which did not influence behavior 
when presented only once seemed to “summate” 
when presented many times (10, 66), Bricker 
and Chapanis (15) found that one presentation 
of a stimulus slightly below the (statistical) limen 
was enough to increase the likelihood of its being 
recognized on subsequent trials. We interpret 
this to mean that too many presentations may 
well raise the “subliminal” stimuli above the limen 
of awareness if the stimuli themselves are not 
carefully chosen. 

As for the physical properties of the message
itself, we can but guess what the relevant issues 
are. Both verbal and pictorial presentations ap- 
parently are effective in the visual modality, but 
no one has tested the relative effectiveness of 
these two types of stimulation. Quite possibly 
subsequent experimentation will show that words 
are best for some situations (such as direct com- 
mands), while pictures are best for others.* It 
can be stated unequivocally, however, that ad- 
vertisers should look to their basic English when 
writing their subliminal commercials. Several 
studies have shown that, the more familiar a sub- 
ject is with the stimulus he is to perceive, the 
more readily he perceives it (22, 54, 63, 110). 
We interpret these studies to mean that unfamiliar 
stimuli may be ineffective when presented sub- 
liminally, even though familiar messages may 
“get through.” 

The exact length the message should be, its 
composition, and the background in which it 
should be presented are variables upon which no 
work has been done and about which no conclu- 
sions can presently be drawn. Suffice it to say, 
however, that a message which would be short 
enough to be perceived by one person might be 
too long for another person to perceive under 
any conditions. 

Which modalities are most useful for subliminal 
stimulation? While most of the work has been 
done on the visual modality, Vanderplas and 
Blake (124) and Kurland (71) have found subthreshold
auditory stimuli to be effective, and
earlier in this paper we have reported similar 
studies with cutaneous stimulation. Advertisers 
who wish to “sneak up on” their patrons by 
presenting subliminal stimuli in one modality 
while the patrons are attending to supraliminal 
stimuli from another modality are probably 
doomed to failure, however. Collier (27) pre- 
sented subliminal geometric forms simultaneously 
to both the visual and the cutaneous modalities 
and found little, if any, lowering of thresholds. 
Correspondingly, it should be remembered that 
Hernandez-Peon et al. (55) found that some part 
of the nervous system acts as a kind of gating 
mechanism, and when an organism is attending
strongly to one modality, the other modalities
are probably “shut off” to most incoming stimuli.


4 Perhaps much of the work on sensory preconditioning
is applicable here. When Ellson (38) presented
his subjects with both a light and a buzzer for
many trials, then presented the light alone, subjects
“heard” the buzzer too.

Even if experimenters succeed in finding an- 
swers to many of the questions raised above con- 
cerning the physical characteristics of the stimuli 
to be employed, it is quite probable that they will 
have succeeded in discovering the source of only 
a small part of the variance operant in subliminal 
perception. For, as always, the major source of 
variance will come from the perceiver himself. 

Characteristics of the Perceiver Which Affect 
Subliminal Perception—The following section of 
this paper might well be considered a plea for the 
recognition that individual differences exist and 
that they must be taken into account by anyone 
who wishes to deal with individuals. We know 
next to nothing about the relationships between 
such factors as age, sex, social class, etc., and 
subliminal perception. Perhaps only one study 
is relevant: Perky (95) found that children were 
as much influenced by subthreshold visual stim- 
ulation as were naive adults. It is quite likely 
that many differences in the perception of sub- 
liminal stimuli do exist between individuals of 
differing classes, ages, and sexes. As always, only 
experimentation can determine what these differ- 
ences are. 

We do have some idea, however, of how what 
might be called “personality factors” influence 
subliminal perception. First and foremost, there 
seems little doubt but that a high need state affects 
perception. Gilchrist and Nesberg (46) found 
that, the greater the need state, the more their 
subjects tended to overestimate the brightness of 
objects relevant to that need. It should be noted
that they were dealing with difference limens, not 
absolute limens, but other studies to be quoted 
later show the same effect for absolute limens. 
It should be noted also that Gilchrist and Nesberg 
apparently overlooked evidence in their own data 
that a strong need affects judgments of non-need- 
related objects in the same direction (but not as 
much) as it does need-related objects. Wispe 
and Drambarean, dealing with visual duration 
thresholds, concluded that “need-related words 
were recognized more rapidly as need increased” 
(131, p. 31). McClelland and Lieberman (87) 
found that subjects with high need achievement 
scores had lower visual thresholds for “success” 
words than did subjects not scoring as high on 
need achievement. Do all of these findings mean
that subliminal ads will work only when some
fairly strong need (of any kind) is present in 
the viewers? Only experimentation can answer 
this question. 

What about abnormalities of personality? 
What effect do they have? Kurland (71) tested
auditory recognition thresholds using emotional 
and neutral words. He found that hospitalized 
neurotics perceived the emotional words at signifi- 
cantly lower thresholds than did a group of nor- 
mal subjects. Does this mean that neurotics are 
more likely to respond to low-intensity subliminal 
commands than normals? Should advertisers take 
a “neurotic inventory” of their audiences? 

A more pertinent problem is posed by the find- 
ings of Krech and Calvin (70). Using a Wechs- 
ler Vocabulary Score of 30.5 as their cutting 
point, they found that almost all college students 
above this score showed better visual discrimina- 
tions of patterns presented at close to liminal 
values than did almost all students scoring below 
the cutting point. Does this mean that the higher 
the IQ, the better the subliminal perception? 
What is the relationship between the value of 
the absolute limen and intelligence? Will adver- 
tisers have to present their messages at such high 
intensities (in order that the “average man” might 
perceive the message) that the more intelligent 
members of the audience will be consciously 
aware of the advertising? 

One further fascinating problem is posed by 
Huntley’s work (64). He surreptitiously ob- 
tained photographs of the hands and profiles of 
his subjects, as well as handwriting samples and 
recordings of their voices. Six months later each 
subject was presented with the whole series of 
samples, among which were his own. Each sub- 
ject was asked to make preference ratings of the 
samples. Huntley reports evidence of a significant 
tendency for subjects to prefer their own forms 
of expression above all others, even though in 
most cases they were totally unaware that the 
samples were their own and even though many 
subjects were unable to identify their own samples 
when told they were included in the series. If 
an advertiser is making a direct appeal to one 
specific individual, it would seem then that he 
should make use of the photographs and record- 
ings of that individual’s behavior as the subliminal 
stimuli. If an advertiser is making an appeal 
to a more general audience, however, it might be 
that he would find the use of pictures and recordings
of Hollywood stars, etc., more efficacious
than mere line drawings, printed messages, and 
unknown voices. 

Nor can the advertiser afford to overlook the 
effects of set and attention. Miller (88), Perky 
(95), and Blake and Vanderplas (12), among 
others, discovered that giving the subject the 
proper set lowered the recognition threshold 
greatly. In fact, in many cases the stimulus in- 
tensity which was subliminal but effective for 
sophisticated subjects was far too subliminal to 
have much, if any, effect upon naive subjects. 
Thus advertisers might do well to tell their audi- 
ences that subliminal messages were being pre- 
sented to them, in order to bring all members of 
that audience closer to a uniform threshold. Does 
this not, however, vitiate some of the effect of 
subliminal advertising? 

As for attentional effects, we have presented 
evidence earlier (46) that strong needs seem to 
have an “alerting” effect upon the organism,
lowering recognition thresholds for all stimuli, 
not just need-related stimuli. In addition to this, 
two studies by Hartmann (52, 53), as well as 
two by Spencer (113, 114), lead us to the belief 
that subliminal stimuli might best be presented 
when either the television or movie screen was 
blank of other pictures. Perhaps, then, subliminal 
commercials in movie houses should be shown 
between features; while on television the com- 
mercials should consist of an appropriate period 
of apparent “visual silence,” during which the 
audience would not be aware of the subliminal 
stimulation presented, but might react to it later. 

One fact emerges from all of the above. Any- 
one who wishes to utilize subliminal stimulation 
for commercial or other purposes can be likened 
to a stranger entering into a misty, confused coun- 
tryside where there are but few landmarks. Be- 
fore this technique is used in the market place, 
if it is to be used at all, a tremendous amount 
of research should be done, and by competent 
experimenters. 


THE ETHICS OF SUBLIMINAL INFLUENCE 


From its beginnings as a purely academic off- 
shoot of philosophy, psychology has, with ever 
increasing momentum, grown in the public perception
as a practical and applied discipline. As
psychologists were called upon to communicate
and interpret their insights and research findings
to lay persons, it was necessary to make decisions 
about what constituted proper professional be- 
havior, since it was evident that the misuse of 
such information would reflect directly on the 
community of psychologists. As a growing number
of our research efforts are viewed as useful
to society, the problem of effective and honest 
communication becomes magnified, although its 
essential nature does not change. Recently, to 
our dismay, the announcement of a commercial 
application of long established psychological prin- 
ciples has assumed nightmarish qualities, and we 
find ourselves unwillingly cast in the role of 
invaders of personal privacy and enemies of so- 
ciety. A kind of guilt by association seems to 
be occurring, and, as future incidents of this kind 
will, it threatens to undermine the public relations 
we have built with years of caution and concern 
for the public welfare. The highly emotional 
public reaction to the “discovery” of subliminal 
perception should serve as an object lesson to our 
profession, for in the bright glare of publicity we 
can see urgent ethical issues as well as an omen 
of things to come. When the theoretical notion 
E = MC² became the applied reality of an atom
bomb, the community of physicists became deeply 
concerned with social as well as scientific re- 
sponsibility. Judging from the intensity of the 
public alarm when confronted with a bare mini- 
mum of fact about this subliminal social atom, 
there exists a clear need for psychologists to 
examine the ethical problems that are a part of 
this era of the application of their findings. 

The vehemence of the reaction to the proposed 
use of a device to project subliminal, or from the 
public’s point of view “hidden,” messages to 
viewers indicates that the proposal touches a 
sensitive area. One of the basic contributors to 
this reaction seems to be the feeling that a tech- 
nique which avowedly tampers with the psycho- 
logical status of the individual ought to be under 
the regulation or control of a trusted scientific 
group. As a professional group, psychologists 
would fit this description, for in the Ethical Standards
of Psychologists (2) there is a clear statement
of their motives and relationship to society:


Principle 1.12-1 The Psychologist’s ultimate 
allegiance is to society, and his professional be- 
havior should demonstrate an awareness of his 
social responsibilities. The welfare of the 
profession and of the individual psychologist
are clearly subordinate to the welfare of the 
public. ... 

Both this statement and the long record of re- 
sponsible behavior of the members of the profes- 
sion would certainly seem to be sufficient to re- 
duce any anxiety the public might have over the 
possible unscrupulous use of this or any other 
device. It is precisely the fact that the public is 
aware that decisions about the use of subliminal 
perception devices rest not with psychologists but 
with commercial agencies that may be distressing 
to the public. The aura of open-for-business 
flamboyance and the sketchily presented percent- 
ages in the first public announcement tended to 
reinforce existing apprehensions rather than allay 
them. 

Although subliminal perception happens now to
be the focus of a great deal of reaction, it is 
merely the most recent in a succession of perturb- 
ing events to which the public has been exposed. 
It has become the focus of, and is likely to be- 
come the whipping boy for, a host of techniques 
which now occupy the twilight zone of infringe- 
ment of personal psychological freedom. It must 
be remembered that to the lay person the notion 
of an unconscious part of the “mind” is eerie, 
vague, and more than a little mysterious. Unable 
fully to comprehend the systematic and theoret- 
ical aspects of such a concept, he must be con- 
tent with overly popularized and dramatic ver- 
sions of it. In every form of mass media the 
American public has been exposed to convincing 
images of the bearded hypnotist (with piercing 
eye) who achieves his nefarious ends by controlling
the unconscious of his victim. It has been
treated to the spectacle of the seeming reincarna- 
tion of Bridey Murphy out of the unconscious 
of an American housewife and, in Three Faces 
of Eve, to complex multiple personalities hidden 
in the psychic recesses of a single individual. 
With such uncanny and disturbing images as an 
emotional backdrop, the appearance of The 
Hidden Persuaders on the best seller lists formed 
the indelible impression of the exploitation of the 
unconscious for purposes of profit and personal 
gain. In combination, this growth of emotionally 
charged attitudes toward the unconscious and the 
suspicions about commercial morality came to 
be a potentially explosive set of tensions which 
was triggered off by the first commercial use of
subliminal techniques. 

What is to be the psychologist’s position in 
regard to future developments with subliminal 
perception? The apparent discrepancy between 
the claims being made for the technique and the
available research evidence suggests a need for 
considerable scientific caution as well as exten- 
sive investigation. The responsibility of psychol- 
ogists in this instance is clearly indicated in the 
code of ethics: 


Principle 2.12-1 The psychologist should re- 
fuse to suggest, support, or condone unwar- 
ranted assumptions, invalid applications, or un- 
justified conclusions in the use of psychological 
instruments or techniques. 

The flurry of claim and opinion about the effec- 
tiveness of subliminal methods seems to be based 
more on enthusiasm than controlled scientific 
experimentation, and it is here that psychology 
can be of service. Until acceptable scientific
answers are forthcoming, we believe psychologists 
should guard against a premature commitment 
which might jeopardize public respect for them. 
The course of scientific history is strewn with the 
desiccated remains of projects pursued with more
vigor than wisdom. 

Scientific caution is essential, but it falls short 
of meeting the ethical issue raised by the nature 
of subliminal perception itself. The most strident 
public objections have been directed toward the 
possibility that suggestions or attempts to in- 
fluence or persuade may be administered without 
the knowledge or consent of the audience. Assurances
that widespread adoption of this technique
would provide increased enjoyment through
the elimination of commercial intrusions, or that 
the users will establish an ethical control over the 
content of the messages presented, can only fail 
to be convincing in the light of past experience. 
The suggestion that the public can be taught 
means of detecting when it is being exposed to a 
planned subliminal stimulation is far from reassur- 
ing since such a suggestion implies that the ability 
to defend oneself warrants being attacked. A 
captive audience is not a happy audience, and 
even the plan to inform the viewers in advance 
concerning the details of what is to be presented 
subliminally may not prevent the public from reacting
to this technique as a demand that it surrender
an additional degree of personal freedom.
Fresh from similar encounters, the public may not
allow this freedom to be wrested from it. 

Finally, the argument that a great deal of our 
normal perception occurs on the fringe of con- 
scious awareness and that subliminal events are 
no more effective than weak conscious stimuli 
rests on opinion and not fact. This seems partic- 
ularly dangerous clinical ground on which to tread 
since the effect, on behavior, of stimuli which may 
possibly be inserted directly into the unconscious 
has yet to be explored. Assurances that this 
technique can only “remind” a person of some- 
thing he already knows or “support” a set of urges 
already in existence but cannot establish a com- 
pletely new set of urges or needs are reckless 
assertions having no evidence to support them. 
So it seems that the aspect of subliminal projec- 
tion which is marked by the greatest potential risk 
to the individual’s emotional equilibrium is the 
aspect about which the least is scientifically 
known. 

The psychologist’s ethical quandary, then, stems 
directly from the inescapable implication of devi- 
ousness in the use of such a technique. The ap- 
propriate guidelines for conduct are provided in 
this ethical statement: 


Principle 2.62-2 It is unethical to employ 
psychological techniques for devious purposes, 
for entertainment, or for other reasons not con- 
sonant with the best interests of a client or with 
the development of psychology as a science. 


It is obvious that “devious purposes” and “the 
best interests . . . of psychology as a science” are 
not self-defining terms and must be interpreted by 
the individual psychologist in the light of the 
circumstances of each situation. It is a trying 
and complex decision to make. If in his mature 
judgment the intended uses of the principles of 
subliminal perception do not meet acceptable 
ethical standards, the psychologist is obligated to 
dissociate himself from the endeavor and to labor 
in behalf of the public welfare to which he owes 
his first allegiance. In this respect, the responsi- 
bility of the social scientist must always be that 
of watchdog over his own actions as well as the 
actions of those to whom he lends his professional 
support. 

The furor which promises to accompany the 
further application of a variety of devices involv- 
ing subliminal perception is certain to embroil 
psychology in a dispute not of its own choosing. 
The indiscriminate and uncontrolled application 
of psychological principles is increasing at a fearsome
rate in the form of motivation research,
propaganda, public relations, and a host of other 
“useful” practices based on the work of psycholo- 
gists. In a very real sense this era of applied 
psychology will be a test of the workability of the 
psychologist’s code of ethics and promises to 
stimulate the profession to give further consider- 
ation to its responsibility for assisting society to 
use its findings wisely. 


HUMAN MOTIVES AND THE
CONCEPT OF THE SELF *


ERNEST R. HILGARD 1


No problems are more fascinating than those 
of human motivation, and none are more in need 
of wise solution. To understand the struggles 
which go on within economic enterprise, to inter- 
pret the quarrels of international diplomacy, or 
to deal with the tensions in the daily interplay 
between individuals, we must know what it is 
that people want, how these wants arise and 
change, and how people will act in the effort to 
satisfy them. 

American psychologists typically believe that 
adult motivational patterns develop through the 
socialization of organic drives. Our preference 
for such an interpretation is understandable be- 
cause our science is rooted in biology. Man is 
assuredly a mammal as well as a member of so- 
ciety, and we begin to understand him by study- 
ing what he has in common with other animals. 
When we accept as the biological basis for moti- 
vation the drives present at birth or developing 
by maturation, it is natural to think of the learned 
social motives as grafted upon these or in some 
way derived from them. Despite the variations 
in the detailed lists of primary drives which dif- 
ferent ones of us offer, and some alternative 
conceptions as to the ways in which socialization 
takes place, we find it easy to agree that adult 
motives are to be understood through an inter- 
action between biology and culture. 


* Reprinted by permission from The American 
Psychologist, September, 1949, Vol. 4, No. 9, 374- 
382. 

1 Address of the president of the American Psy- 
chological Association at Denver, September 6, 1949. 

Without reviewing any further the genetic development
of motives, I wish to turn to some of
the problems arising as we attempt to under- 
stand how these motives affect conduct. In our 
textbooks there is usually some important ma- 
terial left over after we have finished the chap- 
ters on physiological drives and social motives. 
I refer to the problems raised by the so-called 
defense mechanisms or mechanisms of adjust- 
ment. 


THE MECHANISMS OF ADJUSTMENT IN 
MOTIVATIONAL THEORY 


The mechanisms of adjustment were the fea- 
tures of Freudian theory that we earliest domes- 
ticated within American academic psychology. 
They now have a respectable place in our text- 
books, regardless of the theoretical biases of our 
textbook writers. 

The mechanisms did not burst all at once upon 
the psychological scene. Freud had begun to 
write about them in the ’90’s, and by the time of 
his Interpretation of dreams (1900) he had 
named repression, projection, displacement, iden- 
tification, and condensation. In his Three con- 
tributions to the theory of sex (1905) he added 
fixation, regression, and reaction formation. It 
remained for Ernest Jones to give the name 
rationalization to that best-known of the mechanisms.
He assigned this name in an article in
the Journal of Abnormal Psychology in 1908. 
Among the books which brought the mechanisms 
together and called them to the attention of 
psychologists none was more popular than Ber- 
nard Hart’s Psychology of insanity, which ap- 
peared in 1912 and went through several editions 
and many reprintings. Hart treated especially 
the manifestations of identification, projection, 
and rationalization, and introduced that by now 
familiar friend, logic-tight compartments. 

It remained for Gates to collect the mecha- 
nisms into a list in a textbook intended for the 
general student. The evolution of his chapter on 
mechanisms is itself instructive by showing how 
styles change in psychology. In his Psychology 
for students of education (1923), Gates called 
the chapter “The dynamic role of instincts in 
habit formation.” In the first edition of his 
Elementary psychology (1925) he changed the 
title to “The dynamic role of the dominant human 
urges in habit formation.” Then in the next edition
(1928) he used the contemporary sounding
title: “Motivation and adjustment.” The con- 
tent of the chapter underwent only minor revi- 
sions with these changes in title. These widely 
used books did much to place the mechanisms 
on the tips of the tongues of psychology students 
and professors twenty years ago, for by that 
time the mechanisms were already part of the 
general equipment of psychology, and not re- 
served for abnormal psychology or the clinic. 

Some of the tendencies found in Gates’ early 
treatment have persisted in more recent discus- 
sions of the mechanisms. For one thing, we took 
over the mechanisms when as a profession we 
were hostile to other aspects of psychoanalytic 
teaching. As a consequence, we often gave only 
halting recognition to their psychoanalytic origins. 
Nearly all the mechanisms do in fact derive from 
Freud, Jung, Adler and their followers. Among 
the mechanisms in Gates’ 1928 list, psycho- 
analytic writers originated introversion, identifi- 
cation, rationalization, projection, defense mecha- 
nisms, and compensation. Yet Gates’ only men- 
tion of psychoanalysis was in some disparaging 
remarks about the “alleged adjustment by repres- 
sion to the unconscious,” an explanation of ad- 
justment which he rejected as neither true nor 
useful. 

In subsequent discussions of the mechanisms, 
textbook writers have seldom felt called upon to 
take responsibility for serious systematic treat- 
ment. In order to avoid a mere listing of mecha- 
nisms, many writers have attempted some sort of 
classificatory simplification, but there has been 
little agreement on which mechanisms belong 
together. Gates, for example, had included four 
mechanisms under rationalization: projection, 
sour grapes, sweet lemon, and logic-tight com- 
partments. He gave defense and escape mecha- 
nisms separate places, although psychoanalytic 
practice has been to consider all the mechanisms 
as forms of defense. Shaffer (19) separated adjustments
by defense from adjustments by withdrawing,
but he took back much of the distinction
by treating withdrawing as a defense. In
his recent books concerned with the mechanisms, 
Symonds (23, 24) provides a rich collection of 
descriptive material, frankly psychoanalytic in 
orientation, but he succeeds little better than 
those who preceded him in giving a unified treat- 
ment of the mechanisms in relation to motivation. 

The lack of systematic treatment of the mechanisms
has had consequences for their development
as part of psychological science. When
there is no effort to be systematic, problems are 
not sharply defined. When problems are not 
sharply defined, anecdotal evidence is used loosely 
and sometimes irresponsibly. A consequence is 
that very little evidence of experimental sort is 
introduced into the chapters on the mechanisms. 
This does not mean that evidence does not exist. 
It means only that problems have to be more 
carefully formulated before the relevance of exist- 
ing evidence is seen, and before gaps in knowl- 
edge are discovered which evidence can fill. 


THE MECHANISMS AND THE SELF 


It would take us too far afield to review the 
individual mechanisms at this time, and to con- 
sider evidence in relation to them. Instead, we 
may examine some of their most general char- 
acteristics, as they relate to motivational theory. 
These characteristics lend support to a thesis 
which I propose to defend: the thesis that all 
the mechanisms imply a self-reference, and that 
the mechanisms are not understandable unless we 
adopt a concept of the self. 

The thesis that the mechanisms imply a self- 
reference need come as no surprise. Psycho- 
analysts have thought of the mechanisms as pro- 
tecting the ego. Anna Freud’s book on the sub- 
ject bears the title: The ego and the mechanisms 
of defense (6). Non-psychoanalysts have occa- 
sionally endorsed a similar thesis. In their recent 
text, for example, Guthrie and Edwards have 
given a very straightforward account of the de- 
fense mechanisms. Although their text remains 
within the broad framework of behaviorism, they 
do not hesitate to relate the mechanisms to the 
ego. In fact they define defense mechanisms as 
“the reaction patterns which reestablish the ego” 
(7, page 137). 

Let us examine two of the characteristics of 
the mechanisms to see how the thesis of self- 
reference is implied. We may choose to view the 
mechanisms as defenses against anxiety, or we 
may see them as self-deceptive. 

1. The mechanisms as defenses against anxiety. 
The natural history of anxiety in relation to learn- 
ing has been much illuminated by the series of 
experiments with animal subjects performed by
Mowrer, e.g. (13), Miller, e.g. (12), and their
collaborators. 

A white rat is confined in a rectangular box 
of one or more compartments. The animal can 
escape electric shock either by some action within 
the shock compartment (such as depressing a 
lever to shut off the current), or by escaping from 
the dangerous place (as by leaping a barrier). 
Both Mowrer and Miller find that in situations 
like this a new drive is acquired, sometimes called 
anxiety, sometimes called fear. This new drive 
can motivate learning very much like any other 
drive. They accept the general position that 
drive-reduction is reinforcing. Anything which 
reduces the fear or anxiety will reinforce the be- 
havior leading to this reduction. Thus any sort 
of activity or ritual which would reduce fear or 
anxiety might be strengthened. Such activities 
or rituals might have the characteristics of de- 
fense mechanisms. 
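
The drive-reduction argument can be sketched as a toy calculation. What follows is only an illustration under stated assumptions: the response names, the amounts of fear each response removes, the learning rate, and the update rule are all hypothetical, and the rule is not one proposed by Mowrer or Miller. It shows merely that, if reinforcement is proportional to the fear removed, responses which reduce fear gain strength while a response which reduces none does not.

```python
# Toy sketch only: the numbers and the update rule are assumptions made for
# illustration, not a model proposed by Mowrer or Miller.

def train(fear_reduction, trials=20, rate=0.1):
    """Strengthen each response in proportion to the fear it reduces.

    fear_reduction maps a response name to the (hypothetical) amount of
    fear, between 0 and 1, that the response removes on each trial.
    """
    strength = {response: 0.0 for response in fear_reduction}
    for _ in range(trials):
        for response, reduction in fear_reduction.items():
            # Reinforcement is taken to equal the drive reduction; the
            # (1 - strength) factor keeps strength below a ceiling of 1.
            strength[response] += rate * reduction * (1.0 - strength[response])
    return strength


result = train({"press lever": 0.9, "leap barrier": 0.7, "freeze": 0.0})
for response, value in result.items():
    print(f"{response}: {value:.2f}")
```

On this sketch any activity or ritual that happened to lower fear, whether or not it dealt with the real danger, would be strengthened in the same way.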

The natural history of anxiety, according to 
this view, is somewhat as follows. First, the or- 
ganism has experiences of pain and punishment— 
experiences to be avoided. These are followed 
in turn by threats of pain and punishment, which 
lead to fear of the situations in which such threats 
arise. Other situations are assimilated to these 
fear-provoking ones, so the added circumstances 
may lead to apprehension. Fears with these 
somewhat vaguer object-relations become known 
as anxiety states. Sometimes as the apprehensive 
state becomes more and more detached from par- 
ticular frightening situations, clinicians refer to 
it as a state of free-floating anxiety. All of these 
acquired states of fear, apprehension or anxiety 
are tension-states. Any one of them may serve 
as an acquired drive and motivate learning. Activities
which lessen fear and anxiety are reinforced
because tension is reduced. Thus behavior
mechanisms become reinforced and
learned as ways of reducing anxiety. 

The Mowrer-Miller theory of the origin of fear, 
and of its role as an acquired drive, is acceptable 
as far as it goes. But it needs to be carried one 
step further if it is to deal with the kinds of anxiety 
which are found in the clinic. This step is needed 
because in man anxiety becomes intermingled 
with guilt-feelings. The Mowrer and Miller ex- 
periments with animals carry the natural history 
of anxiety through the stages of fear and appre- 
hension, but not to the stage of guilt-feelings. 
In many cases which come to the clinic, the
apprehension includes the fear lest some past 
offense will be brought to light, or lest some act 
will be committed which deserves pain and punishment.
It is such apprehensions which go by
the name of guilt-feelings, because they imply 
the responsibility of the individual for his past or 
future misbehavior. To feel guilty is to conceive 
of the self as an agent capable of good or bad 
choices. It thus appears that at the point that 
anxiety becomes infused with guilt-feelings, self- 
reference enters. If we are to understand a 
person’s defenses against guilt-feelings, we must 
know something about his image of himself. 
This is the kind of argument which supports the 
thesis that if we are to understand the mecha- 
nisms we shall have to come to grips with a con- 
cept of the self. 

2. The mechanisms as self-deceptive. Another 
way of looking at the mechanisms is to see them 
as bolstering self-esteem through self-deception. 

There is a deceptive element in each of the
mechanisms. Rationalization is using false or dis- 
torted reasons to oneself as well as to the world 
outside; using reasons known to be false in order 
to deceive someone else is not rationalization but 
lying. It is entirely appropriate to consider self- 
deception as one of the defining characteristics 
of a mechanism. As another example of what 
I mean, let us consider when aggression should be 
thought of as a mechanism. Aggressive behavior 
which is a form of fighting directly for what you 
want or as a protest against injustice is not a 
mechanism at all, even if it is violent and de- 
structive. It is then simply a direct attempt at 
problem-solving. But displaced aggression has 
the characteristics of a mechanism, because false 
accusations are made, and the object of aggres- 
sion may be related only remotely to the source 
of the need to express aggression. Displaced 
aggression thus contains the elements of self- 
deception, and fits the pattern of the mechanisms. 

There are two chief ways in which we de- 
ceive ourselves. One is by denial of impulses, 
or of traits, or of memories. The second is 
through disguise, whereby the impulses, traits, 
or memories are distorted, displaced, or con- 
verted, so that we do not recognize them for 
what they are. Let us see what evidence there 
is for denial and for disguise. 

The clearest evidence for denial comes through
amnesia, in which memories are temporarily lost.
If such memories can later be recovered without 
relearning, support is given to an interpretation 
of forgetting as a consequence of repression. 
Often in amnesia the memories lost are the per- 
sonal ones, while impersonal memories remain 
intact. 

The man studied by Beck (2), for example, 
had no trouble in carrying on a conversation, in 
buying railroad tickets, or in many other ways 
conducting himself like a mature adult with the 
habits appropriate to one raised in our culture. 
It is a mistake to say that he lost his memory, 
for without memory he would have been unable 
to talk and make change and do the other things 
which are based upon past experience with arbi- 
trary symbols and meanings. But he did lose 
some of his memories. He could not recall his 
name, and he could not recall the incidents of 
his personal biography. The highly selective na- 
ture of the memory loss is an important feature 
of many amnesias. Under treatment, the man 
referred to recovered most of his memories, ex- 
cept for one important gap. This gap was for 
a period in his career in which he conducted 
himself in a manner of which he was thoroughly 
ashamed. 

Disguise, as the second form of self-deception, 
shows in many ways. The most pertinent evi- 
dence from the laboratory comes in the studies 
of projection defined as the attribution of traits. 
Undesirable traits of his own of which the per- 
son prefers to remain unaware are assigned in 
exaggerated measure to other people (18). In
some cases, the deception goes so far as to become 
what Frenkel-Brunswik calls “conversion to the 
opposite.” In one of her studies (4) it was found 
that a person who said, “Above all else I am 
kind,” was one likely to be rated unkind by his 
acquaintances. In the studies of anti-Semitism 
which she later carried on collaboratively with 
the California group she presents evidence that 
anti-Semitism is sometimes a disguise for deep- 
seated attitudes of hostility and insecurity having 
to do with home and childhood, and nothing to 
do directly with experience with Jews (5). 

If self-deception either by denial or by disguise 
is accepted as characteristic of a mechanism, the 
problem still remains as to the source of or reasons 
for the self-deception. The obvious interpretation 
is that the need for self-deception arises because 
of a more fundamental need to maintain or to 
restore self-esteem. Anything belittling to the
self is to be avoided. That is why the memories 
lost in amnesia are usually those with a self-refer- 
ence, concealing episodes which are anxiety or 
guilt-producing. What is feared is loss of status, 
loss of security of the self. That is why aspects 
of the self which are disapproved are disguised. 

In this discussion of the mechanisms I have 
tried to point out that they may be integrated with 
other aspects of motivation and learning provided 
their self-reference is accepted. Then it can be 
understood how they provide defenses against anx- 
iety, and why they are self-deceptive through de- 
nial and disguise. 


THE SELF PRESENT IN AWARENESS 


The mechanisms are comprehensible only if we 
accept a conception of the self. This poses us 
the problem of the nature of the self-concept that 
we may find acceptable. Two main approaches 
lie before us. One approach is to look for the 
self in awareness, to see if we can find by direct 
observation the self that is anxious, that feels 
guilty, that tries various dodges in order to main- 
tain self-respect. The second approach is to infer 
a self from the data open to an external observer, 
to construct a self which will give a coherent ac- 
count of motivated behavior. Let us examine 
these two possibilities in turn. 

We enter upon the task of discovering the self 
in awareness with the warnings from past failures. 
Any naive person who started out to develop a 
psychology of the self would expect to find the 
task relatively easy because self-awareness seems 
to be commonplace. Everybody knows that peo- 
ple are proud or vain or bashful because they are 
self-conscious. But the psychologist knows that 
this self-evident character of self-awareness is in 
fact most illusive. You presently find yourself as 
between the two mirrors of a barber-shop, with 
each image viewing each other one, so that as 
the self takes a look at itself taking a look at 
itself, it soon gets all confused as to the self that
is doing the looking and the self which is being 
looked at. As we review the efforts of Miss 
Calkins (3) and her students to demonstrate that 
there was a self discoverable in every act of intro- 
spection, and find how little convinced Titchener 
and his students were, we are well advised not to 
enter that quarrel with the same old weapons. 
Introspection was taken seriously in those days 
and psychologists worked hard at it. There is
little likelihood that we can succeed where they
failed. 

Their difficulty was not due to the insistence 
upon trained observers. Self-observation of a 
much freer type by naive subjects is little more 
satisfactory. Horowitz’ study of the localization 
of the self as reported by children was not very 
encouraging in this respect (9). Children located 
their selves in the head or the stomach or the 
lower jaw or elsewhere, each individual child be- 
ing reasonably consistent, but the whole picture 
not being very persuasive as to the fruitfulness of 
an approach through naive self-observation. 

But the reason for rejecting a purely introspec- 
tive approach to the search for the self is not lim- 
ited to the historical one that earlier attempts have 
proved fruitless. It is based also on the recognition
that defense mechanisms and self-deception
so contaminate self-observation that unaided in- 
trospection is bound to yield a distorted view of 
the self. 

Having said all this by way of warning, we may 
still allow some place for self-awareness in arriv- 
ing at our concept of the self. Two aspects of the 
self as seen by the experiencing person appear to 
be necessary features in understanding self-organi- 
zation. 

The first of these is the continuity of memories 
as binding the self, as maintaining self-identity. 
To the external observer, the continuity of the 
bodily organism is enough to maintain identity, 
but the person himself needs to have continuous 
memories, dated in his personal past, if he is to 
have a sense of personal identity. One of the 
most terrifying experiences in the clinical litera- 
ture is the state known as depersonalization, in 
which experiences are no longer recognized as 
belonging to the self. Break the continuity of 
memories and we have dissociation, split personali- 
ties, fugue states, and other distortions of the self. 

The second feature of self-awareness which 
cannot be ignored in forming our concept of the 
self is that of self-evaluation and self-criticism. 
I earlier pointed out that we need to understand 
the feelings of guilt which go beyond mere anx- 
iety. Guilt-feelings imply that the self is an active 
agent, responsible for what it does, and therefore 
subject to self-reproof. The other side of self- 
evaluation is that the self must be supported and 
must be protected from criticism. One component
of the self is provided by those vigilant attitudes
which are assumed in order to reduce anxiety
and guilt. It is this vigilant self-criticism in
its harshest form which is implied in Freud’s con- 
cept of the superego. Evaluative attitudes toward 
the self, including both positive and negative self- 
feelings, come prominently to the fore in the inter- 
views recorded by Rogers and his students (16, 
17). 

Another way of putting this is to state that the 
self of awareness is an object of value. McDou- 
gall referred to the sentiment of self-regard as in 
some sense the master sentiment. Murphy, 
Murphy, and Newcomb put it tersely: “The self 
is something we like and from which we expect 
much” (15, page 210). Perhaps I might amend 
the statement to read: “To some people the self 
is something they dislike and from which they 
expect little.” In any case it is an object about 
which attitudes of appreciation and depreciation 
are organized. Snygg and Combs state as the 
basic human need the preservation and enhance- 
ment of the phenomenal self (21, page 58). It 
would be easy to multiply testimony that one of 
the fundamental characteristics of self-awareness 
is an evaluative or judging attitude toward the 
self, in which the self is regarded as an object of 
importance, and preferably of worth (1, 14, 20).

Despite the difficulties in introspective ap- 
proaches to the self, we find that our self-concept 
needs to include some information based on pri- 
vate experience. The continuity of memories 
maintains personal identity, and the awareness of 
the self as an object of value organizes many of 
our attitudes. More is needed, however, to en- 
rich the concept of the self and to make it square 
with all that we know about human motivation. 


THE INFERRED SELF 


This points up the need for a more inclusive 
self-concept, one which will make use of all the 
data. Such a self-concept I shall call the inferred
self. Like any other scientific construct, it will 
prove to be valid to the extent that it is system- 
atically related to data, and it will be useful to the 
extent that it simplifies the understanding of 
events. 

I wish to suggest three hypotheses needed in 
arriving at an inferred self. Each of these, al- 
though plausible, is not self-evident, and there- 
fore requires demonstration. In order to be sci- 
entifically useful, it is important that the inferred 
self should go beyond the obvious. The inferred 
self will prove acceptable only if these hypotheses,
or closely related ones, are supported. 

The first hypothesis is that of the continuity of 
motivational patterns. This means that the or- 
ganization of motives and attitudes that are cen- 
tral to the self is one which persists and remains 
recognizable as the person grows older. Reactions 
to present situations will be coherent with reac- 
tions to past situations. For those who prefer 
the habit concept, the inferred self may be thought 
of as a pattern of persisting habits and attitudes. 
The organization or structure which is implied is 
a learned one, and like any habit structure it car- 
ries the marks of the past in the present. When 
new goals are substituted for old ones, there is 
continuity with the past in the ways in which the 
goals are selected and in the ways in which grati- 
fication is obtained. This is all plausible, but it is 
by no means self-evident, and it is greatly in need 
of empirical study. It is a matter for study and 
demonstration whether or not a continuity can be 
traced between nursing arrangements, thumb- 
sucking, nail-biting, cigarette-smoking, and overt 
sexual behavior. The first hypothesis implies that 
there is such a continuity, whatever motivational 
strands are being followed, so that one form of 
gratification shades imperceptibly into the next. 
If we but knew enough we could trace the con- 
tinuity throughout the life span. 

The second hypothesis supporting the inferred 
self is that of the genotypical patterning of motives.
This hypothesis suggests that motives unlike
in their overt or phenotypical expression may
represent an underlying similarity. It will do no 
good to try to appraise personality by a study 
confined to its superficial expression. What we 
know about the mechanisms of denial and dis- 
guise tells us that the genotypical pattern will have 
to be inferred. Unless we move at the level of 
inference and interpretation, much behavior will 
be baffling or paradoxical. 

The inferred self goes beyond the self of aware- 
ness by including for purposes of inference much 
that is excluded from self-awareness. Awareness 
includes the not-self as well as the self. In dreams 
and hallucinations we have products of the self, 
present in awareness, but products for which the 
self takes neither credit nor responsibility. It is 
hard to see the self as giving the stage-directions 
for the dream, or as selecting the epithets hurled 
by the hallucinated voices. Yet in making a reconstruction
of genotypical motives, these products
of the self enter as evidence. Some items,
then, remain in awareness, but are not part of self- 
awareness. Other items are excluded from aware- 
ness by inattention or amnesia. Facts such as 
these necessitate indirection in the inference to 
motivational organization. A description of overt
conduct is not enough to permit an accurate ap- 
praisal of motivational patterning. 

These assertions may be made with some con- 
fidence, but again confidence of assertion does not 
constitute proof. We need to show by rigorous 
proof that predictions based on the concept of 
genotypical patterning of motives will account for 
behavior either more economically or more ac- 
curately than predictions based on phenotypical 
manifestations of motivated action. 

The third hypothesis is that the important 
human motives are interpersonal both in origin 
and in expression. Despite the fertility of Freud’s 
mind and the penetration of his observations, this 
is one hypothesis about the self which he never 
fully grasped. By good fortune he laid his em- 
phasis upon the one organic need—sex—which 
is inevitably interpersonal in its fullest expression. 
Even so, he remained within the instinct tradition. 
Once we reject the self as the unfolding of an 
inevitable pattern, but see it instead as an indi- 
vidual acquisition, we are impressed by the part 
which other people play in the shaping of an in- 
dividual self. Because the parents and others who 
transmit the culture are themselves a part of the 
culture, there are some uniformities in socializa- 
tion, producing pressure in the direction of a 
modal personality (10). In addition, there are 
diverse roles which are ready-made for the indi- 
vidual, to which he conforms with greater or less 
success. There are the roles of man and of 
woman, of eldest and youngest child, of mother 
and father and in-law, of employer and employee, 
of craftsman and white-collar worker. Finally 
there are the individualizing influences of hered- 
ity, of birth accidents, of childhood experiences. 
There are many details to be filled in, but there 
is little doubt about the general course of sociali- 
zation, leading in the end to internalizing much 
of the culture in the form of personal ideals and 
standards of conduct. 

The self is thus a product of interpersonal in- 
fluences, but the question remains whether the 
end-product is also interpersonal in its expression. 
Does the self have meaning only as it is reflected 
in behavior involving other people, either actually
or symbolically? Is it true that you can describe
a self only according to the ways in which other 
selves react to it? I am inclined to believe that 
the self, as a social product, has full meaning 
only when expressed in social interaction. But I 
do not believe that this is obvious, because I can 
conceive that it might not be true, or might be 
true in a limited sense only. 

These uncertainties about the truth of the hy- 
potheses regarding the inferred self need not be 
regarded as signs of weakness in the concept. On 
the contrary, the concept has greater potential 
richness of meaning precisely because it goes be- 
yond the self-evident and requires empirical study 
and justification. If it turns out that in some 
meaningful sense motivational patterns are con- 
tinuous, that we can unravel their genotypical 
organization, and that we can know in what pre- 
cise way they are interpersonal, then we will have 
a concept of an inferred self that will be genuinely 
useful. 

What does the inferred self imply as to the 
unity of personality? It does not necessarily im- 
ply unity. Conflict as well as harmony may 
be perpetuated through genotypical organization. 
The healthy self, however, will achieve an in- 
tegrative organization. Note that I say integra- 
tive and not integrated. It is the integrative per- 
sonality which can handle the complexity of rela- 
tionships with other persons in a culture like ours, 
a culture which makes plural demands. An inte- 
grated personality soon leads to its own isolation 
or destruction if it is not also integrative. Lest 
this seem to be an idle play on words, let me point 
out that the paranoid psychotic with highly sys- 
tematized delusions is among the best integrated 
of personalities. He is integrated but not in- 
tegrative. The genotypical patterns of motiva- 
tion which comprise the inferred self may or may 
not be integrative. 


A LABORATORY FOR THE STUDY 
OF PSYCHODYNAMICS


I have argued that we need a self-concept if 
we are to understand the richness of human mo- 
tivation, and I have proposed that we adopt an 
inferred self as the unifying concept. Now what 
shall we do about it? 

Perhaps this all sounds very much like clinical 
psychology, so that the answer might come: 
“Leave it to the clinicians.” I believe this to be
the wrong answer, not because I have any lack of 
confidence in clinicians, but because I believe it
represents a faulty conception of the appropriate 
division of labor within psychology. The prob- 
lems of human motivation and personality belong 
to all psychologists. The problems of the self- 
concept are general problems of psychological 
science. 

Instead of assigning these problems to any one 
group of psychologists, I propose that we proceed 
to establish laboratories for the study of psycho- 
dynamics fully commensurate with laboratories 
for the study of perception or learning or other 
problems of general psychology (11). 

A laboratory for the study of psychodynamics 
differs from the clinic in its intent, though there 
will be overlap in staff, in procedures, and in 
problems. I am assuming that people are referred 
to a clinic or come there voluntarily in order to 
be helped with their personal problems. By con- 
trast, subjects are invited to come to a laboratory 
because they fit into an experimental design. The 
laboratory permits delimitation of problems and 
control of variables in a manner usually less pos- 
sible in a situation geared to service. 

In order to make the picture of the psycho- 
dynamics laboratory concrete, we may sketch a 
few specimen problems likely to be worked upon. 
Many of these problems will have had their origin 
in clinical experience, and many fruitful hypoth- 
eses will have come from the clinic. But the task 
of achieving precision in the testing of scientific 
generalizations belongs to the laboratory. 

Of first moment are the problems involved in 
the natural history of the self. This will mean 
concentrated study of young children, under arrangements
which permit the testing of hypotheses.
For many years we have given assent to the
importance of language as an instrument of so- 
cialization, but we have a paucity of data. Piaget 
asked many of the right questions, but his con- 
jectures have to be refined and put to the test in 
a manner more convincing to American psycholo- 
gists. I should assign the study of the child’s 
language as a task of high priority in the psycho- 
dynamics laboratory. This is but one aspect of 
discovering in what ways the self is a social prod- 
uct. 

Other problems include the details of influence 
by important people in the child’s environment. 
Some studies now under way at Stanford (8) 
suggest that patterns of sibling rivalry among 
young children are often traceable to unresolved
rivalries going back to the parents’ childhoods.
A parent may act as a director of the drama, 
assigning the roles to the children, and calling the 
turns on a new performance that largely re-enacts 
one of a previous generation. While there is satis- 
factory evidence from case histories that this sort 
of parental influence goes on, just how it comes 
about, and just how the parent is protected from 
becoming aware of what is being done, need to be 
studied under laboratory-type controls. 

Another developmental problem worthy of care- 
ful exploration has to do with the magical ideas 
of childhood, sometimes referred to as the feeling 
of omnipotence. While the stubborn realities of 
the environment soon trim down the sense of 
power to more finite proportions, magical concep- 
tions continue even into adult life, influencing the 
interpretations of causal sequences. I do not refer 
simply to superficial manifestations, as in the 
prevalence of superstition. When the investigator 
begins to look, he finds that there are many ways 
in which individuals believe themselves to have 
magical powers, to be among the specially gifted, 
to be so precious as to be specially vulnerable, 
to be able to shape events through willing them 
to be. In a scientific age like ours, these magical 
ideas are taboo, and consequently may influence 
behavior while being largely out of awareness. 
If we understood this desire to gain satisfaction
through the expression of magical power, we 
would better understand some of the most puz- 
zling aspects not only of an individual’s behavior, 
but of the dynamics of economic and political life. 

I have chosen these few illustrations (lan- 
guage, sibling rivalry, and the magic of power) 
to illustrate the sorts of problems which can be 
studied in arriving at a natural history of the self. 

Let me turn now to a set of problems in the 
answer to which experiments with animal sub- 
jects are particularly promising. These are defin- 
ing experiments on the concepts of anxiety, shame, 
and guilt. I have already referred to the excel- 
lent start made by experiments on fear and anx- 
iety in rats. It may take a more sociable animal, 
such as the dog, to exhibit the behavior we call 
shame. There is no doubt that the dog can act as 
if ashamed. I do not know whether or not a dog 
can act as if guilty. Shame may be thought of 
as a response to being caught by someone else in 
socially disapproved behavior; guilt may be
thought of as a response to catching yourself in
behavior discordant with your own conscience. 
Can both shame and guilt go on outside of aware- 
ness, or is guilt alone subject to unconscious ex- 
pression? Is the concept of guilt applicable only 
to man? We need better definitions, but we also 
need to know what is the case. I should like to 
see the psychodynamics laboratory work on the 
problem of clarifying what is meant by anxiety, 
shame, and guilt, and instructing us about the 
principles according to which these processes 
occur. 

The psychodynamics laboratory is the place in 
which to make a direct study of the self-organiza- 
tion which permits conflicts within the self as 
dramatized in the Freudian notions of id, ego, and 
superego. This particular partitioning of the self 
is probably too rigid to be acceptable, but there
are genuine problems which the partitioning is
designed to explain, and these problems are still 
in need of explanation. Anna Freud suggests 
that under hypnosis the hypnotist sets aside the 
subject's ego. Others have suggested that the 
superego is soluble in alcohol. If appropriate 
hypotheses are clearly stated, it ought to be pos- 
sible to design experiments to test them by bias- 
ing the outcome of the wars within the self. That 
is, through appropriate techniques, perhaps using 
hypnosis or alcohol, one or the other of the 
fighters in the battle could be strengthened or 
weakened. Thus it should be possible to deter- 
mine with greater precision the nature of the par- 
ticipants in self-conflict. 

Another problem is that of rapport, which arises
because we need to know the circumstances un- 
der which a person can freely report private ex- 
perience with a minimum of distortion. Con- 
sider the following three situations. First is the 
administration of projective tests, say the Ror- 
schach or the TAT. It is assumed, rightly or 
wrongly, that rapport with the test-administrator 
can be established fairly promptly. It is also as- 
sumed, rightly or wrongly, that once rapport is 
established, responses are primarily to the stimu- 
lus cards rather than to the test administrator. 
All this needs study, but we may accept this situ- 
ation as involving a relatively low order of rap- 
port. Next in our scale is the ordinary inter- 
viewing situation, in which the subject, alone with 
the psychologist, reports private experiences. Here 
it is plausible to assume that rapport is more important
than in the test situation, so that what
the person reveals becomes more closely related 
to the inter-personal situation the interviewer is 
able to create. The third situation, with rapport 
at a maximum, is that of hypnosis, in which rap- 
port is exaggerated beyond that ordinarily found 
in the interviewing situation. These graded situ- 
ations provide an excellent series in which to 
study what rapport does to the possibility of re- 
porting personal experiences with varying degrees 
of distortion. 

Another problem is that of insight as a factor 
in personality reorganization. Here we have a 
problem directly pertinent to clinical practices, 
and to psychotherapy, but there is pertinence to 
general psychology also. How is the insight of 
which the psychotherapist speaks related to that 
of which the animal psychologist speaks? There 
is a similarity in that both have to do with sensible 
problem-solving, based on the ways in which situ- 
ations are perceived. 

In studying the achievement of insight we have 
an opportunity to compare the self present in 
awareness with the inferred self. What we mean 
by insight in this context is essentially that the 
self of which the person is aware comes to corre- 
spond to the inferred self—in other words, that 
the person comes to see himself as an informed 
other person sees him. This is what is meant 
by an objective attitude toward the self. The self 
may be granted the privilege of privacy, but even 
the view of the self held in private is such as 
could be communicated to a trusted outsider. 
This explains the enigmatic statement of the late 
Harry Stack Sullivan that one achieves mental 
health to the extent that one becomes aware of 
one’s inter-personal relations (22, page 102). 
When the relations to other people become com- 
municable to oneself and potentially to another, 
then these relations are no longer confused by the 
distortions of neurotic mechanisms. 

It is sometimes said that the mechanisms are 
blind and inflexible, little subject to the ordinary 
principles of learning (25). But they can be 
unlearned; this is, in fact, one of the chief tasks 
of psychotherapy. There are perhaps two main 
ways in which, through insight, the mechanisms 
can be defeated. 

The first of these methods is to become aware 
of the mechanisms, so that the person can catch 
himself using them. He may learn to interpret 
his own headaches and his own outbursts of temper.
Because he knows what he is doing, he is
able to control his conduct. Insight here is into 
symptoms and the chain of events of which these 
symptoms are a part. Following insight the chain 
of events may be broken, so that the sequences 
do not flow to their usual conclusions. Guthrie 
has made use of a notion very like this (despite 
his discomfort with insight as a concept) in urg- 
ing that the way to gain control over a habit 
sequence is to identify the cues. By alienating 
these cues, the objectionable habit sequence is interrupted.

The second method overlooks the detailed ac- 
tion of the mechanisms entirely, while seeking in- 
sight into whatever has made the mechanisms 
necessary. There is a reevaluation of the self 
and its motives, a willingness to accept features 
of the self which were previously unacceptable. 
If more security can be achieved by abandoning 
the mechanisms than was achieved by them, they 
do not have to be fought. The mechanisms simply 
dissolve because they are no longer needed. 

It is important to know whether or not this is 
a two-stage process, or an interaction between two 
methods of solving the same problem. This is 
not something to be debated, but something to be 
studied and understood. 

We are ready today, as we might not have been 
a few years ago, to establish psychodynamic labo- 
ratories to attack and answer many of the ques- 
tions which I have raised. Such laboratories will 
provide opportunities for co-operation between 
experimental and clinical psychologists on prob- 
lems of mutual concern. The staff to be invited 
to work in these laboratories will include psychol- 
ogists with a variety of backgrounds, united in 
their acknowledgment that the search for the self 
is a significant scientific endeavor.

PERCEPTION, THE SELF, AND PERSONALITY


CONCEPTUAL AND 
METHODOLOGICAL PROBLEMS 
IN INTERPERSONAL PERCEPTION * 


N. L. GAGE AND LEE J. CRONBACH 1


In studies of interpersonal perception, the proc- 
ess most often investigated has been given such 
names as “empathy,” “social sensitivity,” “accu- 
racy of social perception,” “insight,” and “diag- 
nostic competence.” Despite variations in ter- 
minology and method, the studies have similar 
aims. Knowledge about interpersonal perception 
is intended to be significant for social psychology 
and personality theory, as well as for practical 
problems in leadership, marital relations, clinical 
work, and teaching. Many difficulties, however, 
prevent clear interpretation of the results so far 
obtained. We attempt here to point out major 
pitfalls, to evaluate research procedures commonly 
used or recently advocated, and to suggest better 
designs for studies in this area. 


NEED FOR SHARPENED CONCEPTUALIZATION 


In studies of empathy and its sister traits, the 
basic variable has been only hazily conceptualized. 
This difficulty characterizes early research in any 
area; “intelligence,” “attitude,” and “adjustment” 
have all suffered from inadequacies of conceptu- 
alization comparable to those afflicting empathy. 
Writers have inadequately specified just what they 
mean to measure, or to what extent the variable 
they study overlaps the variables in other investi- 
gations. Thus, one test of empathy finds out how 
accurately subjects predict the ratings acquaint- 
ances will give them. Another test of empathy 
requires that subjects estimate the musical preferences
of the average factory worker. Not surprisingly,
these tests correlate only .02 (12).

* Reprinted by permission from Psychological Review,
1955, Vol. 62, No. 6, 411-422.

1 One of the present writers has been engaged in
empirical research on interpersonal perception, trying
various testing techniques and methods of analysis
perception (2, 3).

Implicit Assumption of Generality.—One fundamental
question concerns degree of generality.
Is understanding of others a highly generalized 
trait, or is it a collection of response patterns 
which have only a surface similarity? From the 
failure of many writers to delimit their concept, 
one gets the impression that they expect some 
people to be consistently good judges of others, 
and some people to be consistently poor judges; 
that is, a rather general trait is assumed to exist. 
If a Judge does well in predicting what response 
Others 1, 2, and 3 will give to stimuli a, b, and c, 
some investigators evidently would expect him to 
do well in predicting the responses of Others 4, 
5, and 6 to stimuli x, y, and z. Only an expecta- 
tion of this character would lead one to try pre- 
dictions of musical tastes as a possible gauge of 
the effectiveness of a foreman, or to accept a test 
of ability to predict responses of office workers as 
a parallel form to a test of ability to predict re- 
sponses of factory workers (15). 

A generalized trait such as “empathic ability” 
may profitably be used as a construct if changes 
in the individual’s behavior from situation to sit- 
uation are small compared to differences between 
individuals in the same situation. The fact that 
mental tests correlate positively makes “general 
mental ability” a useful concept (though we can 
also differentiate that concept into more specific 
subtraits). “Resistance to stress” appears to be 
much less general; the general trait must be re- 
placed by more specific traits describing resistance 
to particular stresses. 
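The criterion stated above, that changes within an individual across situations be small compared with differences between individuals, can be illustrated with a rough variance comparison. The sketch below is illustrative only: the accuracy scores are invented, and the function name `generality_ratio` and the choice of a simple between/within variance ratio are assumptions, not a procedure taken from the studies cited.

```python
from statistics import mean, pvariance

def generality_ratio(scores):
    """Ratio of between-Judge variance to average within-Judge variance.

    scores maps each Judge to a list of accuracy scores, one per situation.
    A ratio well above 1 is what a general trait would produce; a ratio
    near or below 1 points to situation-specific response patterns.
    """
    judge_means = [mean(v) for v in scores.values()]
    between = pvariance(judge_means)
    within = mean(pvariance(v) for v in scores.values())
    return between / within

# Hypothetical data: three Judges, each scored in three situations.
consistent = {"J1": [8, 9, 8], "J2": [5, 5, 6], "J3": [2, 3, 2]}
erratic = {"J1": [8, 2, 5], "J2": [3, 9, 4], "J3": [6, 4, 7]}

print(generality_ratio(consistent))
print(generality_ratio(erratic))
```

In the first data set the Judges keep their rank order from situation to situation and the ratio is large; in the second the ordering shifts and the ratio falls well below 1.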

So, perhaps, with accuracy in interpersonal per- 
ception. Accuracy in predicting another’s re- 
sponses in one situation does correlate with accu- 
racy in predicting another set of responses (7, 
21). But it is questionable whether this accuracy 
must be ascribed to an “empathic” process, or 
even to genuine understanding (4, 21). It is
therefore critically important to know just when
measurements of empathy in one situation justify
generalization to other situations or to a construct 
transcending particular situations. Until a general 
“ability to understand others” is established, work- 
ers should proceed with great caution, and define 
in any theoretical statement or interpretation of 
results just what facet is being discussed. 

Nonequivalence of Alternative Operational Definitions.—One
reason why empathy is inadequately
conceptualized is that many investigators
have been content to rely on a simple “operational 
definition.” Having invented a face-valid tech- 
nique to measure the adequacy with which one 
person could understand another, investigators 
have neglected to inquire into its meaning. En- 
tranced by the beauty of their operations, they 
have cloaked these limbs with barely enough con- 
ceptual analysis to provide some scientific respec- 
tability for their reports. 

Most recent studies of interpersonal perception 
require a Judge to predict the responses of an 
Other. The predictions are scored for accuracy 
against the actual responses or characteristics of 
the Other. The responses to be predicted and the 
experimental conditions for obtaining scores have 
varied greatly from one study to the next. To 
clarify what present tests deal with, and thereby 
indicate the possible subdivisions of the field, we 
draw attention to four components of the typical 
experimental design: 


a. The Judge whom the experimenter is at- 
tempting to measure. 

b. The Other(s) whom the Judge is asked to 
interpret. 

c. The Input, or information concerning the 
Other which is available to the Judge. 

d. The Outtake, i.e., the statements or predic- 
tions about the Other obtained from the 
Judge. 


The experimenter may, for example, decide to 
ask (a) kindergarten teachers to observe (b) 
children, and, using (c) cues obtained during ob- 
servation, to predict (d) the sociometric choices 
each pupil will make. 

Understanding another person may be regarded 
as having two stages, which suggest two continua 
for classifying investigations. First, the Judge 
must take in information, perhaps by observing 
the Other, or perhaps by dealing with him over a
period of time; the first continuum therefore deals
with the degree of acquaintance of the Judge with
the Other. Second, the Judge must interpret the 
information in order to arrive at predictive state- 
ments; the second continuum therefore deals with 
the degree of extrapolation or inference required
between Input and Outtake.2 An experiment may
be designed to make great demands on the intake
process (little acquaintance) or the interpretative
process (much extrapolation), or both, or neither.
The extreme patterns are contrasted in Table 1.

2 Meehl (17, pp. 68-71) has used a parallel distinction
in identifying two possible applications of the
phrase “clinical intuition”: (a) to the situation in
which the clinician cannot be articulate about the
evidence for his diagnosis; (b) to that in which the
clinician cannot “show in what manner a particular
hypothesis was arrived at from the stated evidence.”
These two aspects of intuition, namely, “evidence”
and “manner of arriving at,” seem to resemble
our “acquaintance” and “extrapolation,” respectively.
With a high degree of acquaintance the judge would
have a great deal of evidence, and a high degree of
extrapolation would require what Meehl calls “the
creative act of hypothesis-formation.”

This table makes it clear that understanding of
other persons demands different things of the per-
ceiver in different situations. If we ask a person
questions about Others where he has had ample
opportunity to learn the answers by experience
(Pattern A), we are primarily measuring his
knowledge. When we present him with questions
which he cannot answer on the basis of past
experience alone, we are measuring ability to ac-
quire new knowledge. But different abilities are
required, depending upon whether the difficulty
he faces is that of gathering information (Pattern
B), or of drawing inferences (Pattern C), or both
(Pattern D). A Judge who performs well in one
pattern might perform badly in another.

Classification of Studies According to Objects
of Perception.—It is also necessary to inquire just
what “Others” are involved in any hypothesis;
unless this is clearly delimited, it can only be
assumed that the investigator is interested in a
generalized ability to understand all other persons.
Various studies have used quite different objects
of perception, asking the Judge to predict:


a. How persons in general will behave;

b. How a particular category of persons devi-
ates from the behavior of persons in general;

c. How a particular group deviates from the
typical behavior of the particular category it
belongs to;

d. How an individual deviates from the typical
behavior of the particular group he belongs
to;

e. How an individual on a particular occasion
will deviate from his typical behavior.


We can show that each of these types of under-
standing may be useful. (a) General principles
such as “All people have a need to be approved”
are expectations which guide conduct. (b) The
individual forms expectations about different cate-
gories of people: managers or labor leaders, for
example. (c) The person discriminates within a 
category, to form expectations about a particular 
group he is associated with. An officer can make 
wise decisions about his men on the basis of a 
correct stereotype of enlisted men in general, but 
he can make even wiser decisions by taking the 
particular wishes of his own squadron into ac- 
count. (d) One next comes down to describing 
the unique behavior of the individual Other, as in 
clinical diagnosis. (e) The final step, prediction 
of differences within the individual over occasions, 
is illustrated when a therapist decides that in 
certain sessions it is better to review than to intro- 
duce new interpretations. 

Our five types of Others and four patterns point 
to 20 rather different ways in which an ability to 
understand Others may be defined. Though not 
all these combinations are equally significant, re- 
search plans and interpretations need to be speci- 
fied in terms of some such concepts as these. 
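The count of 20 follows from crossing the five objects of perception with the four patterns of Table 1. The enumeration below is a minimal sketch; the dictionary names and the short labels are paraphrases introduced here for illustration, not terms from the original.

```python
from itertools import product

# Four patterns from Table 1: (degree of acquaintance, degree of extrapolation).
PATTERNS = {
    "A": ("much acquaintance", "little extrapolation"),
    "B": ("little acquaintance", "little extrapolation"),
    "C": ("much acquaintance", "much extrapolation"),
    "D": ("little acquaintance", "much extrapolation"),
}

# Five objects of perception, labelled a-e as in the text.
OTHERS = {
    "a": "persons in general",
    "b": "a category of persons",
    "c": "a particular group",
    "d": "an individual",
    "e": "an individual on a particular occasion",
}

# Each (Other, Pattern) pair is one way of defining "understanding Others".
designs = list(product(OTHERS, PATTERNS))
print(len(designs))

for other, pattern in designs[:3]:
    acquaintance, extrapolation = PATTERNS[pattern]
    print(f"({other}) {OTHERS[other]}: Pattern {pattern}, {acquaintance}, {extrapolation}")
```

Stating which of these cells a study occupies is one way of specifying a research plan "in terms of some such concepts as these."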


NEED FOR INTERPRETABLE SCORES 


Comparing the Judge’s predictions with the
Other’s actual behavior readily yields an accuracy
score, but this score is difficult to interpret because
a large number of processes may be postulated
to explain it. This problem may best be
described if we treat for the moment the simple 
situation where the prediction and the Other’s 
actual behavior are reported dichotomously, and 
the prediction may therefore be scored as right 
or wrong. The conclusions would be modified 
only in detail if the score were based on magni- 
tude of errors. 

Controlling Effects of Real Similarity.—Con- 
sider the study where we have (a) the Judge’s 
self-description, (b) the Other’s self-description, 
and (c) the Judge’s prediction of b. The re- 
sponses to any item have three aspects: 


RS (real similarity): agreement of a and b 
AS (assumed similarity): agreement of a and c 
ACC (accuracy): agreement of b and c 


Only two of these three are independent rela- 
tions. That is, when two of these relations are 
known, the third may be inferred. Thus, if AS 
and RS on an item are scored 1 (denoting agree- 
ment), ACC must be 1. Scores for the three rela- 
tional variables are obtained by summing the 
values obtained on single items. Any score may 
be considered a resultant of the other two. What 
we regard our test as measuring therefore depends 
on how we choose to conceptualize the problem, 
as has been pointed out by Tagiuri, Blake, and
Bruner (22). Empirical studies have reported
relations between the scores—for example, that
Judges more typical of a group have higher ac-
curacy in judging members of that group. But
this may result merely from the linkage repre-
sented in the operations defining the scores, for
when AS is constant and greater than RS, ACC
and RS are correlated. Such a conclusion is a
logical necessity, not a psychological finding re-
garding any superior insight on the part of the
more typical Judge.


TABLE 1

FOUR TYPES OF STUDIES OF INTERPERSONAL PERCEPTION


Pattern A

Judge-Other relationship: Much acquaintance
Input-Outtake relationship: Little extrapolation
Hypothesized process: When acquainted with an Other, the Judge has many
opportunities to observe him; some Judges habitually take better advantage
of these opportunities than do others, paying better attention to the Other
and cumulating more information about him.
Illustration: Asking high school counselors to agree or disagree that “The
majority of adolescents say they have conflict with their parents”
Quality represented in accuracy measure: Knowledge from past experience


Pattern B

Judge-Other relationship: Little acquaintance
Input-Outtake relationship: Little extrapolation
Hypothesized process: Encountering a stranger, the Judge has some opportunity
to observe him; some Judges are better able than others to take advantage of
this brief opportunity, hence cumulating more information.
Illustration: Having Judges interview strangers and then rate their command
of English
Quality represented in accuracy measure: Ability to observe


Pattern C

Judge-Other relationship: Much acquaintance
Input-Outtake relationship: Much extrapolation
Hypothesized process: When acquainted with an Other, the Judge has many
opportunities to observe him; some Judges are better able than others to use
the information thus acquired, together with some personality theory, to
derive accurate statements about variables not observed directly.
Illustration: Having husbands predict personality test responses of their
wives
Quality represented in accuracy measure: Ability to infer


Pattern D

Judge-Other relationship: Little acquaintance
Input-Outtake relationship: Much extrapolation
Hypothesized process: Encountering a stranger, the Judge has some opportunity
to observe him; some Judges are better able than others to use the
information thus acquired, together with some personality theory, to derive
accurate statements about variables not observed directly.
Illustration: Asking clinicians to make predictions of scholastic success
from projective tests
Quality represented in accuracy measure: Ability to observe and infer
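The dependence among RS, AS, and ACC on dichotomous items can be checked directly. The following is a minimal sketch with invented responses; the function names are hypothetical. It scores the three relations, shows that item-level accuracy can be recovered from RS and AS alone, and shows that a Judge who merely projects himself (c identical to a) obtains an ACC score equal to RS.

```python
def relational_scores(a, b, c):
    """Return (RS, AS, ACC) summed over dichotomous items.

    a: Judge's self-description, b: Other's self-description,
    c: Judge's prediction of b. Each is a list of 0/1 responses.
    """
    rs = sum(x == y for x, y in zip(a, b))
    as_ = sum(x == z for x, z in zip(a, c))
    acc = sum(y == z for y, z in zip(b, c))
    return rs, as_, acc

def acc_from_rs_and_as(a, b, c):
    """Infer accuracy from the RS and AS relations alone.

    On a dichotomous item, b agrees with c exactly when the RS and AS
    relations are both agreements or both disagreements.
    """
    return sum((x == y) == (x == z) for x, y, z in zip(a, b, c))

# Hypothetical responses to six items.
judge_self = [1, 0, 1, 1, 0, 0]
other_self = [1, 1, 1, 0, 0, 1]
prediction = [1, 1, 0, 1, 0, 0]

rs, as_, acc = relational_scores(judge_self, other_self, prediction)
print(rs, as_, acc, acc_from_rs_and_as(judge_self, other_self, prediction))

# A Judge who simply projects himself predicts c = a.
rs_p, as_p, acc_p = relational_scores(judge_self, other_self, judge_self)
print(rs_p, as_p, acc_p)
```

Under pure projection, accuracy is nothing more than real similarity, which is why a more typical Judge can appear more accurate without any superior insight.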

There is evidence that AS—perceiving Others 
as similar to oneself—is highly general over items. 
A person tends to assume similarity to the same 
degree throughout a questionnaire, despite marked 
variety in the apparent content of the items (20). 
Moreover, the tendency is somewhat general over 
preferred Others; if the Judge’s AS score toward 
one friend is high, it will probably be high when 
he predicts the behavior of another friend (20). 
AS relative to a liked person, however, does not 
predict whether AS will be high or low when the 
Judge predicts for a disliked person (20). There 
is some justification, then, for regarding differ- 
ences in the Judge’s AS from Other to Other as a 
reflection of the Judge’s attitude toward the 
Others (6). The AS score is to some extent a 
reflection of the Judge’s general attitude toward 
other persons. But probably the AS score is also 
influenced by the Judge’s set while taking the test; 
for example, Lundy (16) found that Judges who 
acquired facts about Other while interacting under 
a pay-attention-to-yourself set displayed more AS 
than did Judges instructed so as to have a pay- 
attention-to-the-Other set. 

While the ACC score has a simple operational 
definition, it clearly does not correspond directly 
to any simple construct or trait. One possible 
solution is to obtain separate estimates of more 
elemental component variables (cf. 2). In mak- 
ing such analyses, however, the investigator risks 
embracing new confusions as he divorces the old. 

Hastorf and Bender (13) have proposed to 
subtract AS from ACC (which they call “raw 
empathy”) to estimate “refined empathy.” This 
proposal has serious weaknesses, which may be 
clarified by considering the possible configura- 
tions of responses on dichotomous items. If a 
is the Judge’s self-description, b is the Other’s 
self-description, and c is the Judge’s prediction, 
the patterns shown in Fig. 1 are possible. (With 
more than two response alternatives per item,
the “WAD” cell contains two distinct patterns—
b and c alike, or b and c different—which would 
modify the following argument.) 

                          Real Dissimilarity (RD)    Real Similarity (RS)
                          a ≠ b                      a = b

Assumed Similarity (AS)   Unwarranted Assumed        Warranted Assumed
a = c                     Similarity (UAS)           Similarity (WAS)
                          a = c ≠ b                  a = b = c

Assumed Dissimilarity     Warranted Assumed          Unwarranted Assumed
(AD)                      Dissimilarity (WAD)        Dissimilarity (UAD)
a ≠ c                     a ≠ b = c                  a = b ≠ c

a = Judge's self-description; b = Other's
self-description; c = Judge's prediction

FIG. 1. Possible combinations of assumed and
real similarity of any dichotomous item.

Following Kelly and Fiske (14, p. 108), we
see that the total ACC score over all items is
WAS plus WAD. The AS score is WAS plus
UAS. When a Judge is predicting an Other, we
may regard the real similarity or real dissimilarity
of this pair on any item as fixed independently
of any social perception by the Judge. Now we
may ask, within the real similarity (RS) items:
If the Judge predicts correctly, is he accurate?
Or does he assume similarity? Obviously, these
questions are operationally identical. The count
of such items represents “warranted assumed 
similarity,” and there is no way to distinguish 
whether this represents the mental set to assume 
similarity or the ability to judge accurately. In 
the Bender-Hastorf correction procedure, sub- 
tracting AS from ACC, we find that AS on RS 
items cancels ACC on RS items. Thus the RS 
items do not enter the refined empathy score. 
Among real dissimilarity (RD) items where 
he predicts correctly, we might ask: Does the 
Judge recognize the dissimilarity or does he as- 
sume dissimilarity? These questions are both 
reflected in the count of WAD items. The 
Bender-Hastorf refined empathy score is equal 
to WAD — UAS. Therefore, the refined empathy 
score has a perfect negative correlation with AS,
when RS is held constant. Furthermore, it has 
higher range when Judge and Other are dissimilar. 
Clearly, Bender and Hastorf did not arrive at
a measure of accuracy independent of AS and
RS. 
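The cell counts of Fig. 1 and the resulting identity for the refined empathy score can be traced on a small example. This is a sketch with invented responses to eight dichotomous items; the function name `similarity_cells` is hypothetical.

```python
def similarity_cells(a, b, c):
    """Count the four Fig. 1 cells over dichotomous items."""
    cells = {"WAS": 0, "UAS": 0, "WAD": 0, "UAD": 0}
    for x, y, z in zip(a, b, c):
        real_similar = x == y       # RS item if True, RD item otherwise
        assumed_similar = x == z    # AS if True, AD otherwise
        if assumed_similar:
            cells["WAS" if real_similar else "UAS"] += 1
        else:
            cells["UAD" if real_similar else "WAD"] += 1
    return cells

# Hypothetical responses: a = Judge's self-description,
# b = Other's self-description, c = Judge's prediction.
judge_self = [1, 0, 1, 1, 0, 0, 1, 0]
other_self = [1, 1, 1, 0, 0, 1, 0, 0]
prediction = [1, 1, 0, 1, 0, 1, 0, 1]

cells = similarity_cells(judge_self, other_self, prediction)
acc = cells["WAS"] + cells["WAD"]        # total accuracy
assumed = cells["WAS"] + cells["UAS"]    # total assumed similarity
refined = acc - assumed                  # "refined empathy"

print(cells)
print(acc, assumed, refined, cells["WAD"] - cells["UAS"])
```

The WAS count enters both ACC and AS and so drops out of the difference, leaving only the real-dissimilarity items to determine the refined score.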

The four categories of items in Fig. 1 have 
two degrees of freedom after the total number 
of RS items for a Judge-Other pair is established. 
We can take out two scores, and would like those 
scores to be independent. What score will be 
most meaningful depends on the correlation be- 
tween the various cells. No single cell yields a 
good score as it stands, for the cell entry is 
influenced by RS. 

One possibility is to employ the ratios 
WAS/RS and WAD/RD to summarize our in- 
formation about the Judge. This procedure re- 
quires that enough items be used to keep the 
denominator large; otherwise, of course, the ratio 
becomes unreliable. In any case, however, the 
ratios for different Judges will be based on dif- 
ferent items; this removes what may be an essen- 
tial experimental control. The correlation be- 
tween WAS/RS and WAD/RD should be deter- 
mined. If these components are positively cor- 
related, it follows that individual differences in 
prediction are more strongly determined by dif- 
ferences in ACC than by differences in AS tend- 
ency. The correlation will be negative if indi- 
vidual differences in prediction are more strongly 
influenced by AS than by ACC. 
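The two ratios can be computed from the four cell counts of a Judge-Other pair. The sketch below uses invented counts and a hypothetical function name; it also returns no value when a denominator is zero, which is one simple way (an assumption here, not a recommendation from the text) of flagging the unreliability that arises when RS or RD is small.

```python
def component_ratios(was, uas, wad, uad):
    """Return (WAS/RS, WAD/RD) from the cell counts of one Judge-Other pair.

    RS = WAS + UAD (items on which Judge and Other really agree);
    RD = WAD + UAS (items on which they really differ).
    A ratio is None when its denominator is zero.
    """
    rs = was + uad
    rd = wad + uas
    return (was / rs if rs else None, wad / rd if rd else None)

# Hypothetical cell counts (WAS, UAS, WAD, UAD) for three Judges,
# each paired with a different Other on a 30-item questionnaire.
judges = {"J1": (12, 3, 9, 6), "J2": (5, 2, 18, 5), "J3": (1, 0, 2, 27)}

for name, counts in judges.items():
    print(name, component_ratios(*counts))
```

For the third Judge only two items are real-dissimilarity items, so WAD/RD rests on a very small denominator, and the items entering each ratio differ from Judge to Judge.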

Distinguishing Stereotype and Differential Ac- 
curacy.—The accuracy score may be divided in 
another manner (2, 7), yielding components 
which we may refer to as “stereotype accuracy” 
and “differential accuracy.” The former refers 
to the individual’s ability to predict the pooled 
responses of a given category of persons, whereas 
the latter refers to his ability to differentiate 
among individuals within the category. 

Whatever score is used should reflect accuracy 
in predicting an Other at the intended level of 
specificity. If we are asking a Judge to predict
the response of an individual to a personality in- 
ventory, we are probably interested in the fourth 
of our five types of Others, and want to measure 
“ability to predict how this individual deviates 
from the typical behavior of the particular group 
he belongs to.” If accuracy is scored directly by 
comparing the prediction to the response of the 
individual, we are not distinguishing between two 
components which contribute to the Judge’s success:
his knowledge of the response that any individual
in the subgroup is likely to give, and his
knowledge of the way in which this individual
deviates from the norm. 

It is apparently desirable, when studying abil- 
ity to predict at any one level, to obtain at least 
two scores: (a) ability to predict the typical be- 
havior in the next-larger class to which Other 
belongs, and (b) ability to predict how Other 
deviates from the norm for this class. This would 
apply whether Other refers to an individual, a 
particular Army squadron, or some category such 
as “education majors.” 

There are three ways to measure differential 
and stereotype accuracy. 

1. Where the Judge predicts the response of
several Others, it is possible to determine the re- 
sponse of the average Other on each item, and 
the average of the Judge’s predictions on that 
item. Thus we form an average profile of re- 
sponses, and an average predicted profile. The 
distance between these two, possibly after remov- 
ing differences in over-all average response, is a 
measure of stereotype accuracy (2). Here the 
stereotype that the Judge holds is inferred from 
his responses over many Others. 

2. Where the Judge predicts for several Others, 
we can score each prediction for one Other 
against the responses of the remaining Others for 
whom the predictions were not intended (4, 7). 
Such “accidental” accuracy, when averaged over 
the unintended Others, reflects the understanding 
which is general over all members of the group 
of Others rather than specific to a particular 
Other. It provides a sort of “psychological 
chance” base line. Accuracy measured in this 
way is closely related to stereotype accuracy of 
the first kind, but is also affected by the disper- 
sions of predictions and self-descriptions (3, 
p. 472). 

3. If we ask the Judge to indicate what propor- 
tion of a group will give a particular answer, or 
to mark the modal answer to be expected in a 
given group, his stereotype is expressed directly. 
This prediction can be compared with the actual 
responses of the group. It is quite possible that 
the stereotypes obtained by this direct method 
would not coincide with the stereotype obtained 
by the other two methods. Such a discrepancy 
between what might be called the Judge’s “con- 
scious” and “unconscious” concept of the group 
could be of considerable interest, and studies obtaining
both measures on the same Judge are
called for.

In interpreting a stereotype accuracy score, a
relation analogous to the assumed similarity-real 
similarity interaction may be noted. A person 
who is similar to the group, and who predicts that 
the group will in general give responses similar to 
his own, will almost certainly have high stereo- 
type accuracy. A person who is atypical will have 
low stereotype accuracy if he assumes that other 
people give the same responses he does. If he 
assumes that others are different from himself, he 
may have either high or low stereotype accuracy, 
depending upon what differences actually occur. 
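The first two ways of measuring stereotype accuracy described above can be sketched as follows. The data are invented, the function names are hypothetical, and the use of Euclidean distance for the first method is an assumption, since the text does not name a particular distance measure. The example Judge gives the same prediction for every Other, that is, he relies on a stereotype alone.

```python
from math import dist
from statistics import mean

def agreement(pred, actual):
    """Proportion of items on which a prediction matches a response."""
    return sum(p == r for p, r in zip(pred, actual)) / len(actual)

def stereotype_distance(predictions, responses):
    """Method 1: distance between average predicted and average actual profile."""
    avg_pred = [mean(col) for col in zip(*predictions)]
    avg_resp = [mean(col) for col in zip(*responses)]
    return dist(avg_pred, avg_resp)

def accidental_accuracy(predictions, responses):
    """Method 2: score each prediction against the Others it was not meant for."""
    scores = [
        agreement(pred, resp)
        for i, pred in enumerate(predictions)
        for j, resp in enumerate(responses)
        if i != j
    ]
    return mean(scores)

def intended_accuracy(predictions, responses):
    """Ordinary accuracy: each prediction scored against its own Other."""
    return mean(agreement(p, r) for p, r in zip(predictions, responses))

# Hypothetical data: three Others, four dichotomous items.
responses = [[1, 1, 0, 0], [1, 1, 0, 1], [1, 0, 0, 0]]
predictions = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0]]

print(stereotype_distance(predictions, responses))
print(intended_accuracy(predictions, responses))
print(accidental_accuracy(predictions, responses))
```

Because this Judge does not differentiate among Others, his intended accuracy is no higher than his accidental accuracy; whatever accuracy he shows lies entirely in the stereotype.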

The responses obtained in studies of social per- 
ception may be scored in many ways. The vari- 
ous scores so obtained are likely to be experimen- 
tally linked. Observed correlations are then likely 
to be artifacts of the experimental design, rather 
than relations among the traits the scores are 
named after. This type of difficulty is illustrated 
in a recent study (19). The Judges provided 
predictions of the norm, i.e., of what the average 
Other would say. This is c in Fig. 1. Empathy 
was defined as agreement between c and the true 
norm (b). Reality was defined as agreement 
between c and the average prediction of all Judges 
(c̄). If b agrees with c̄ on more than half the
items (as it surely would unless the Judges as a
group are unrealistic in the extreme), then it necessarily
follows that the empathy (b-c) scores
of individuals will be positively correlated with
reality (c̄-c) scores. The actual correlation was
.77. No meaning can be attached to this result.
There is an empirical fact underlying it, which
is adequately described by the degree of overlap
between the true norm (b) and the average predicted
norm (c̄). Correlations among scores
should not be interpreted in terms of higher order 
psychological constructs unless the operational 
variables are free from artifactual linkage. 
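The artifactual link between the "empathy" and "reality" scores can be demonstrated without any psychological assumption. The sketch below is a simplified illustration, not a reanalysis of the study cited: it assumes a hypothetical six-item dichotomous test, treats the average prediction of all Judges as a fixed key, and takes every possible prediction pattern as equally likely. All names and data are invented.

```python
from itertools import product
from math import sqrt

def agreement_count(u, v):
    """Number of items on which two response patterns agree."""
    return sum(x == y for x, y in zip(u, v))

def pearson(xs, ys):
    """Product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# The fixed key differs from the true norm on one item only,
# so the two agree on 5 of the 6 items.
true_norm = (1, 0, 1, 1, 0, 1)       # b
average_pred = (1, 0, 1, 1, 0, 0)    # average prediction of all Judges

# Every possible individual prediction pattern c.
patterns = list(product((0, 1), repeat=len(true_norm)))
empathy = [agreement_count(c, true_norm) for c in patterns]
reality = [agreement_count(c, average_pred) for c in patterns]

print(round(pearson(empathy, reality), 3))
```

Under these assumptions the correlation depends only on the overlap between the two keys: with agreement on k of n items it works out to (2k - n)/n, which is positive whenever k exceeds half of n.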


NEED FOR INDEPENDENT CRITERIA OF SOCIAL SKILL 


Many investigators have hypothesized that em- 
pathy, or accuracy of social perception, is corre- 
lated with effectiveness in interpersonal relations, 
and positive correlations have been found in sev- 
eral studies. In some research designs (e.g., 5, 
11, 18), however, a linkage between accuracy 
score and criterion gives rise to an artifactual correlation.
The obtained correlation cannot be interpreted.

For example, when sixth-grade pupils ranked 
each other sociometrically, and also estimated
what ratings they would receive, accuracy in estimation
correlated .50 with sociometric acceptance
(5). But, as the authors report, pupils tended to 
predict that they would be highly accepted. 
Those who are indeed highly accepted automat- 
ically obtain better accuracy scores, and this fact 
alone would account for the observed correlation. 

Another investigator (18) asked employees to
rate their department supervisor. Then each 
supervisor predicted the ratings given by his de- 
partment employees, and his predictive accuracy 
was scored. The correlation of accuracy with 
actual rating was .90. But we can expect a person 
to state that the group he is responsible for has 
good morale. This hypothesis alone is sufficient 
to account for the reported correlation, without 
introducing empathy or sensitivity as a construct. 
Suppose (to simplify) that the rating is on a 5- 
point scale, and that every supervisor predicts 
that his group will rate him 5 (very good). Now 
if each supervisor receives rating b, his accuracy 
score will be 5 minus b (a low score representing 
high accuracy). Accuracy would obviously cor- 
relate perfectly with the actual rating. Random 
variations in the supervisors’ predictions would 
lower the correlation, perhaps to the obtained 
value of .90. When findings can be explained 
parsimoniously as an artifact, investigators have 
the responsibility of making and reporting what- 
ever analysis is necessary to preclude such possi- 
bilities. 
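The simplified supervisor example can be reproduced numerically. The ratings below are invented and the variable names are hypothetical; accuracy is taken as the negative of the error score so that a higher value means better accuracy.

```python
from math import sqrt

def pearson(xs, ys):
    """Product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

def accuracy_scores(predicted, actual):
    """Negative absolute error, so that higher means more accurate."""
    return [-abs(p - b) for p, b in zip(predicted, actual)]

# Hypothetical mean ratings received by six supervisors (5-point scale).
actual = [2.1, 3.4, 4.8, 3.9, 2.7, 4.2]

# Case 1: every supervisor predicts the top rating of 5.
uniform = [5.0] * len(actual)
r_uniform = pearson(accuracy_scores(uniform, actual), actual)

# Case 2: predictions still flattering, but with some variation.
varied = [5, 4, 5, 5, 4, 5]
r_varied = pearson(accuracy_scores(varied, actual), actual)

print(round(r_uniform, 3), round(r_varied, 3))
```

With identical flattering predictions the correlation is perfect; with variation in the predictions it drops but remains high, although no supervisor in the example shows any sensitivity to his group.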

Even if the predicted variate b is experimentally 
independent of the social effectiveness criterion X, 
X can have an artifactual correlation with predic- 
tive accuracy whenever b and X are correlated. 
In the study of supervisor’s sensitivity (18), the
rating by subordinates (b) correlated .86 with
such a second criterion, executives’ ratings of
departmental production (X). This is a reason- 
able finding which we do not question. But if all 
supervisors made identical self-flattering predic- 
tions (c), their accuracy (b-c) in estimating 
workers’ attitude would correlate .86 with this 
second criterion. The actual correlation was .82. 
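The step from .86 for b to .86 for accuracy can be written out. The derivation assumes, as in the simplified case above, that every supervisor makes the same prediction K at the top of the scale and that accuracy is the negative of the error score; the symbols A and K are introduced here for illustration.

```latex
\begin{align*}
A_i &= -\lvert K - b_i \rvert = b_i - K \qquad (b_i \le K), \\
r_{AX} &= r_{(b-K)\,X} = r_{bX} = .86 .
\end{align*}
```

Subtracting a constant from b leaves its correlation with X unchanged, so under these assumptions the accuracy score inherits the whole of the correlation between b and X.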

It is possible that accuracy in perceiving an 
Other improves one’s effectiveness in dealing with 
that Other. But designs more subtle than that 
described above are required to establish such a 
relation. The response predicted by the Judge, 
on which the accuracy score is based, must not 
also be the effectiveness criterion of the study, 
nor may it be correlated with this criterion. One
possible design would be to measure accuracy in
estimating the pattern of Other’s responses with 
elevation or social desirability eliminated (75, 23). 
A pupil’s knowledge as to which Others will give 
him highest sociometric ratings should not be 
artifactually related to criteria of his popularity. 
The supervisor’s or teacher’s empathy might be 
assessed by determining if he knows in which 
respects his group is best satisfied. Another de- 
vice is to use a “standard Other,” requiring every- 
one whose accuracy is tested to make predictions 
for the same individual (27) or group (1). The 
investigator should take pains to test for the pres- 
ence of artifacts by establishing whether the re- 
sponse-to-be-judged is uncorrelated with the cri- 
terion. 


THE PROCESS OF SOCIAL PERCEPTION 


The foregoing sections have emphasized the 
defects of recent conceptualizations and proce- 
dures in research on social perception. It would 
be unwise, however, to ignore the positive contri- 
butions stemming from this work: sharpened 
logical and psychological formulations and, par- 
ticularly, insights concerning real and assumed 
similarity, stereotype and differential accuracy, 
and the dangers of artifactual relationships. 
These insights will apply to research on social 
perception in any kind of situation and with any 
kind of material—for example, even if the Judge 
is allowed to report freely on observations and 
interpretations of the real-life characteristics and 
behaviors of an Other. 

Beyond this, some tentative substantive conclu- 
sions have begun to emerge, revealing what goes 
on when one person perceives another. Is the 
Judge’s perception actually determined in any 
one-to-one fashion by cues he receives from the 
Other? Or is the reaction to the Other more 
“global”? Results with the kinds of data that 
have been collected to this point strongly suggest 
that the latter alternative is closer to the truth. 
Various global dispositions of the Judge appear 
to account for much of the variance in accuracy 
scores.

Two dispositions of this kind can be identified. 
First, Judges seem to differ significantly in their 
over-all tendencies to react favorably or unfavor- 
ably toward Others, both before and after the 
Others are observed. The Judge’s favorability 
seems to determine his predictions or perceptions 
in a way that goes well beyond any identifiable
stimuli coming from the Other. Then, if the
Judge likes the Other, he will predict favorable, 
socially acceptable self-descriptions by the Other 
on a questionnaire or rating scale. If the Other 
does indeed describe himself favorably, the Judge 
will be accurate. But this accuracy stems more 
from a fortuitous concomitance of general favor- 
ability sets than from any differentiated percep- 
tion of the Other. 
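
The point can be illustrated with a minimal sketch. It assumes a hypothetical Judge who ignores every cue from the Other and simply predicts the favorable alternative on each item; the function name and the item counts are illustrative and are not taken from any study discussed here.

```python
def accuracy_of_favorable_set(other_self_description):
    """Proportion of items a cue-blind, favorably disposed Judge gets right.

    other_self_description: list of booleans, True where the Other
    endorsed the favorable (socially acceptable) alternative.
    The hypothetical Judge predicts True for every item.
    """
    predictions = [True] * len(other_self_description)
    hits = sum(p == actual
               for p, actual in zip(predictions, other_self_description))
    return hits / len(other_self_description)


# Hypothetical Others: one describes himself favorably on 9 of 10 items,
# the other on only 3 of 10.
favorable_other = [True] * 9 + [False]
unfavorable_other = [True] * 3 + [False] * 7

print(accuracy_of_favorable_set(favorable_other))    # 0.9
print(accuracy_of_favorable_set(unfavorable_other))  # 0.3
```

The same undifferentiated set yields a high or a low accuracy score depending only on how the Other happens to describe himself, which is the sense in which such accuracy is fortuitous.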

A second kind of disposition has been termed 
the Judge’s “implicit personality theory” (2); this 
consists of “built-in” correlations that the Judge 
consciously or unconsciously imposes on the traits, 
characteristics, or behaviors of Others. If the 
Judge is disposed to see trait B whenever he sees 
trait A, he will be accurate whenever traits A and 
B actually occur together in a given Other, and 
inaccurate when they do not. Judges have been 
found to differ in the closeness and direction of 
the associations they implicitly assume between 
traits. These differences among Judges influence 
their predictions of Others’ responses, again in a 
way that seems to go far beyond any identifiable 
stimuli coming from the Others. 

Hence, in the bulk of research to date, social 
perception as measured is a process dominated far 
more by what the Judge brings to it than by what 
he takes in during it. His favorability toward the 
Other, before or after he observes the Other, and 
his implicit personality theory, formed by his ex- 
periences prior to his interaction with the Other, 
seem to determine his perceptions. Most of the 
research that leads to this conclusion has been in 
situations where degree of acquaintance has been 
low. But this conclusion also seems to follow 
from studies where clinicians have been the judges, 
using the richest data that their diagnostic meth- 
ods can provide, and from studies where husbands 
and wives have judged each other. 

Probably we should not be surprised at this con- 
clusion. It has its analogies, of course, in visual 
and aural perception. The process of perception 
is so laden with affect, and so highly over-learned 
in the course of socialization, that the dominant 
role of global dispositions might well be expected. 

Research to test the limits of these conclusions 
readily suggests itself. We can begin with situa- 
tions where degree of acquaintance is close, and 
only a small degree of extrapolation is required. 
Under these conditions we can get a base line in 
which specific identifiable cues dominate the perception.
Then we can increase the degree of
extrapolation and decrease the degree of acquaintance.
At what point will the Judge's implicit
personality theory and over-all favorability-unfa- 
vorability begin to appear and then to dominate? 
This question will not be researchable, of course, 
until it is put in operationally defined terms. 
Nonetheless, it appears at present that we shall 
not need to go far to find the perceiver rather 
than the stimulus determining the perception. 


SUMMARY 


This paper describes several conceptual and 
methodological problems in research on inter- 
personal perception and presents suggestions for 
dealing with them. 

1. Sharpened conceptualizations of interper- 
sonal perception processes are needed. It is often 
believed that accuracy in social perception constitutes
a general trait. But accuracy has different
operational definitions in different studies; this
alone is sufficient to account for the contradictory 
evidence reported. Interpersonal perception 
makes different demands on the Judge, varying 
with the degree of acquaintance between the 
Judge and the Other, and with the degree of 
extrapolation required from Input to Outtake. 
Five types of Other are identified in various types 
of research, ranging from persons in general to 
intraindividual variations. Each of the definitions 
of the problem requires separate study. 

2. In measuring accuracy of interpersonal per- 
ception, research workers should take account of 
the altered meaning of accuracy scores as real 
similarity of the Judge to the Other varies. Faults 
in previously suggested “corrections” are noted. 

3. Distinguishing between stereotype and dif- 
ferential accuracy should also make for more 
meaningful results. 

4. Many reported relationships between accu- 
racy of interpersonal perception and effectiveness 
in interpersonal relationships are contaminated by 
artifacts. Methods of avoiding artifacts are sug- 
gested. 

5. Social perception, in most research to date, 
appears to be more a global process than a one-to- 
one response to cues received from the Other. 
SECTION VI


Learning, Stress, and Performance


The psychology of learning deals with the kinds of 
variables affecting the strengthening and weaken- 
ing of response tendencies. Increasing evidence 
at the human level suggests that personality attributes
of subjects constitute one such class of
variables. Both Spence and Franks, using differ- 
ent measures and situations, have shown in their 
articles in this section how certain personality 
characteristics inferred from test scores are sig- 
nificantly related to performance in conditioning 
and learning situations. Spence, working within 
a Hullian theoretical framework, has related anx- 
iety scores to performance in simple and complex 
learning situations. Franks, using a number of 
personality measures, has assessed the significance 
of individual differences in relation to condition- 
ability. 

Whereas Spence and Franks were interested in 
the influences of questionnaire-inferred personal- 
ity variables on behavior, Child and Waterhouse 
have focused their attention on the effects on per- 
formance of a concept, frustration, which has 
been widely discussed and variously defined by 
students of personality. Using as their point of 
departure the famous Barker, Dembo and Lewin 
study of frustration in children, Child and Water- 
house examine the problems of the characteristics 
of frustrating situations, their influence on be- 
havior, and the interpretations which can be 
made of this influence. 

Another variously defined concept, and one 
which has received even more attention from 
researchers and theorists than has frustration, 
is that of stress. Lazarus, Deese and Osler’s
article on stress presents a lucid analysis of the 
problem of the role of the stress concept in em- 
pirical investigation. These writers emphasize the 
interacting effects on performance of such fac- 
tors as individual differences associated with sub- 
jects, the way in which the stress situation is de- 
fined and created by the experimenter, and the re- 
sponse measures to which he attends. Just as 
Child and Waterhouse contributed to the experi- 
mental study of personality through their parsi- 
monious analysis of the concept of frustration, so 
also have Lazarus, Deese and Osler performed a 
similar task in their attempt at isolating variables 
relevant to the concept of stress.
A THEORY OF EMOTIONALLY
BASED DRIVE (D) AND ITS 
RELATION TO PERFORMANCE IN 
SIMPLE LEARNING SITUATIONS * 


KENNETH W. SPENCE 


A number of years ago we instituted at the 
University of Iowa a series of experiments con- 
cerned with the role of aversive motivational fac- 
tors in learning situations. In addition to the 
more usual direct manipulation of variables in- 
fluencing the motivational state of an individual, 
such, for example, as varying the intensity of a 
noxious stimulus, degree of motivation was also 
varied in these studies by employing selected sub- 
jects who differed in terms of their performance 
on a so-called scale of emotional responsiveness 
or manifest anxiety (31). That these experiments 
have aroused considerable interest among both 
clinical and experimental psychologists is readily 
evident, not only from the large number of pub- 
lished studies that have attempted either to check 
or extend our experimental findings, but also from
the not infrequent critical reactions they have 
elicited. Now, while some of the criticisms di- 
rected against our studies undoubtedly have merit, 
it has been rather dismaying to discover the ex- 
tent to which many of them reflect a serious lack 
of understanding of the structure and purpose of 
the basic theoretical framework underlying the 
experiments. 

While some of the responsibility for this fail- 
ure to understand the nature and objectives of 
the theory can be assigned to the critics, I hasten 
to acknowledge that our theoretical treatments 
have been quite inadequate. The major difficulty 
is that the studies have appeared only in experi- 
mental journals in which space limitations have 
required that theoretical discussions be kept to a 
minimum. Since each article tended to limit the 
discussion to those portions of the theory relevant 
to the particular phenomena being reported, the 
theory has been presented only in a very piece- 
meal fashion. Apparently our hope that the in- 
terested reader, particularly the critic, would fa- 
miliarize himself with the theory as a whole by 
considering all of the articles has not been realized. 


* Reprinted by permission from The American 
Psychologist, 1958, Vol. 13, 131-141. 
THEORETICAL SCHEMA


One of the purposes of this paper is to provide 
a more systematic presentation of our basic theory, 
or, to use an expression recently introduced by 
Cronbach and Meehl (4), of the nomological net- 
work underlying our studies. Following this the 
experimental evidence bearing on the theory will 
be presented and discussed. Fig. 1 presents the 
main concepts employed, at least in so far as one 
kind of learning situation, classical conditioning, 
is concerned. At the top of the figure are shown 
the experimentally manipulated independent variables
such as N, the number of paired conditioning
trials; S_u, the unconditioned stimulus; ΣS_u,
the number of previous presentations of the unconditioned
stimulus; R_A, score on the anxiety or
emotional responsiveness scale. The empirical
response measure at the lower right-hand corner
is the dependent variable. Inside the rectangle
are represented the several theoretical concepts
(intervening variables) and the interrelations assumed
among them. The arrows indicate the
functions relating the dependent response measure
to the intervening variables, and the latter to the
experimentally manipulated variables. Details of
the portion of the theory between the intervening
variable E and the empirical response measure
(R_p), involving such theoretical concepts as oscillatory
inhibition and response threshold, have
been omitted since our present purpose does not
require them. It is sufficient to state that response
frequency (R_p) is some positive monotonic function
of excitatory potential E.

Fig. 1. Diagram representing portion of theoretical
schema relevant to data for classical conditioning.
(See text for explanation of symbols.)
[Figure labels: N, S_u, ΣS_u, S_shock, R_A; S_c]

That the schema presented in Fig. 1 conforms 
to the Cronbach and Meehl concept of a nomo- 
logical net is readily apparent. Thus to quote 
these writers: “The laws in a nomological network 
may relate (a) observable properties or quanti- 
ties to each other; or (b) theoretical constructs to 
observables; or (c) different theoretical constructs 
to one another” (4, p. 290). One may readily
find examples in our schema of each of these
“laws,” or as I would prefer to call them, “relations,”
since the term “law” typically has a narrower
meaning than these authors have given it.

The theory takes its start from Hull’s basic 
assumption that the excitatory potential, E, determining
the strength of a response is a multiplicative
function of a learning factor, H, and a generalized
drive factor, D, i.e., E = H × D (9).
We have assumed, further, that the drive level, 
D, in the case of aversive situations at least, is 
a function of the magnitude or strength of a 
hypothetical response mechanism—a persisting 
emotional response in the organism, designated 
as r_e, that is aroused by any form of aversive
stimulation. That is, aversive, stressful stimula- 
tion is assumed to arouse activity under the con- 
trol of the autonomic nervous system, which, ac- 
cording to some neurophysiological evidence, may 
act as an energizer of cortical mechanisms. Those 
of you who are familiar with the theoretical writ- 
ings of Miller (13) and Mowrer (14) will recog- 
nize that this mechanism is similar to one these 
writers have postulated in connection with their 
investigations of acquired motivation. Thus they 
assumed that aversive stimuli arouse a hypotheti- 
cal pain (emotional) response which, when condi- 
tioned to previously neutral stimulus events, pro- 
vides the basis for an acquired drive of fear. 

On the basis of analogy with overt reflexes to 
noxious stimulation, there were a number of prop- 
erties that could be assigned to our hypothetical 
response mechanism. Three, in particular, will 
be discussed here. The first and most obvious is 
based on our knowledge that the magnitude or 
strength of observable reflexes to noxious stimu- 
lation (e.g., the corneal reflex to an air puff, the 
GSR to an electric shock) varies directly with 
the intensity or degree of noxiousness of the stimulus.
Assuming our hypothetical emotional response,
r_e, would exhibit the same property, it
followed that the level of drive, D, present in 
classical defense conditioning would be a positive 
function of the intensity of the US. From the 
remaining portion of the theory, it could be deduced
that the performance level, e.g., frequency
of CR’s, would vary positively with the intensity
of the US employed. At the time of the original 
formulation of our theory there was some evi- 
dence, in particular an experiment by Passey (15), 
which supported this implication of the theory.
A second implication of our hypothetical mechanism
was based on the adaptive property of
observable reflexes to noxious stimuli: namely, 
that such responses characteristically exhibit adap- 
tation or weakening with repeated stimulation. 
On the assumption that our hypothetical emotional 
response would behave in an analogous manner, 
it followed that, if a series of trials employing 
the US alone were given prior to conditioning, 
a lower level of D would be present during the 
subsequent conditioning than if no such adaptation 
trials were given. But if D were lower, the level 
of performance in the conditioning situation 
would also be lower following such adaptation 
trials than without them. This assumption or 
hypothesis, if you wish, is represented in Fig. 1 
by r_e = f(ΣS_u). At the time of formulation of
the theory, we found a study by MacDonald (12) 
that gave results precisely in line with this impli- 
cation. 
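
The chain of inference in this paragraph can be written out compactly. What follows is only a sketch in the notation of Fig. 1: the directions of the relations are those stated in the text, R_p is taken to denote the response-frequency measure, and the function names g and h are labels introduced here for illustration, since the text leaves the functional forms unspecified.

```latex
\begin{align*}
r_e &= f(\Sigma S_u), & & f \text{ decreasing in } \Sigma S_u
      \quad \text{(adaptation)} \\
D   &= g(r_e),        & & g \text{ increasing in } r_e \\
E   &= H \times D     & & \\
R_p &= h(E),          & & h \text{ positive monotonic}
\end{align*}
```

Read from top to bottom for a fixed value of H: more US-alone presentations before conditioning give a weaker r_e, hence a lower D, a lower E, and a lower frequency of conditioned responses.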

The third implication of our theoretical mecha- 
nism was based on the well-known fact or obser- 
vation that individuals differ in the magnitude of 
their reflex responses to a given intensity of stimu- 
lation. By analogy, again, we were led to assume 
that individuals would differ characteristically in 
the magnitude of this response, r_e, to the same
intensity of stressful stimulation. If now there 
were available some means of assessing differences 
in this emotional responsiveness of individuals, 
our theoretical schema would lead to the predic- 
tion that highly emotional subjects, as assessed 
by the measuring device, would exhibit a higher 
level of performance in aversive forms of condi- 
tioning than subjects who scored low on the 
device. 

The problem thus became one of attempting 
to develop a test for identifying individual differ- 
ences in the responsiveness of this hypothetical 
emotional mechanism. Such a test, of course, 
would have to be defined independently of the 
measures that were to be employed in testing the 
theoretical network, i.e., the measures of per- 
formance in conditioning and other learning situ- 
ations. It was in connection with this portion of 
our theory that the Manifest Anxiety or A-scale 
was developed. The idea of using a self-inven- 
tory test that would differentiate subjects in terms 
of the degree to which they admitted to possessing 
overt or manifest symptoms of emotionality was 
suggested by Taylor in a doctoral dissertation 
(30).
At this point I should like to make a methodological
digression and comment on a criticism
recently made concerning this aspect of our re- 
search. One pair of critics, inspired, but unfor- 
tunately not too enlightened, by the excellent 
article of Cronbach and Meehl (4) on construct 
validity, insisted that we should have developed 
our scale for measuring D on the basis of a theory 
so that, and I quote them, “performance on it 
might be a basis for inferring drive (differences) 
independently of the outcome of subsequent ex- 
periments” (10, p. 162). While there are a num- 
ber of highly questionable methodological points 
in the arguments of these critics, I should like to 
call attention here merely to the fact that it is 
simply not true that no theorizing guided us in 
the development of the A-scale. As has just been 
recounted, we did have some very definite theo- 
retical notions as to what lay behind differences 
in level of generalized drive, D, especially in the 
case of classical defense types of conditioning. 

This theory, that D is a function of the strength 
of the emotional response made by the organism 
to the noxious stimulation, had already received 
considerable support. Its extension in the present 
instance to the individual difference variable logi- 
cally demanded that we measure the emotional 
responsiveness of individuals under comparable 
environmental conditions. Naturally, so-called 
physiological indices of emotionality, such as, for 
example, changes in pulse rate or in the GSR, 
were indicated; and we have conducted some re- 
search along this line. However, it occurred to 
Taylor that it might be both interesting and valu- 
able to investigate the possibility of making use 
of the presumed behavioral symptoms of emo- 
tionality that clinicians have described. That the 
questionnaire type of test developed turned out 
as well as it apparently has is to the credit, I 
think, of the clinical psychologists who selected 
the behavioral items as indicative of emotionally 
over-reactive individuals. 

In this connection a further comment is in order 
concerning a surprising question that was asked 
by these same critics. Thus the problem was 
posed as to what the consequences would have 
been for either the theory or the test had the 
experiments using the A-scale been negative. The 
answer to the question, at least as regards the theory,
should be obvious. The implications of the other 
portions of our theory with respect to our response
mechanism, r_e, were sufficiently well confirmed
that we would have had no hesitancy
about abandoning the A-scale as being related to 
D in our theory. Since, however, the implica- 
tions of this aspect of our theoretical net were 
confirmed, we have continued to employ the A- 
scale as one operational definition of this emo- 
tional responsiveness variable. That a more satis- 
factory scale, even one of this questionnaire type, 
can be developed, I have no doubt. Indeed, I 
would recommend that some of the time and 
energy now being squandered in the many dis- 
torted, even mendacious, criticisms that seem to 
find such ready acceptance in our current dis- 
cussion-type journals be directed at this more 
constructive task. If the main purpose of these 
attacks is to discredit and eliminate the theory, 
they will fail in this objective, for the history of 
science clearly reveals that a theory is usually 
discarded only when a better theory is advanced. 
The same goes for the constructs within a theory. 


EXPERIMENTAL EVIDENCE 


With these methodological remarks out of the 
way, let us turn now to the experimental evidence 
bearing on our theoretical schema. I shall spend 
the major part of my limited space presenting and 
discussing the findings of our eyelid conditioning 
experiments, for it was in connection with data 
from this type of learning situation that the 
schema was originally formulated. With regard 
to performance curves of conditioning, e.g., fre- 
quency curves, the implications of the theory are, 
as we have seen, that level of performance will be 
a positive function of (a) the intensity of the US, 
(b) the level of score on the A-scale, and (c) the 
intensity of an extra stressful stimulation. We 
shall take up the first two of these variables to- 
gether; since space is so limited, I shall present 
only those studies which had the largest sample 
of subjects and hence have provided the most 
reliable and stable data. 

Eyelid Conditioning Experiments.—In Fig. 2
are presented the findings of two experiments 
(one unpublished; the other, 27) one of which 
involved 120 subjects and the other 100 subjects. 
Both studies employed two levels of puff intensity,
.6 lb. and 2.0 lbs./sq. in., in the one represented
in the lower graph; .25 lb. and 1.5 lbs./sq.
in., in the upper graph. Each study also involved
two levels of emotionality (upper and lower 20% 
of subjects on the A-scale). Examination of the 
curves in both graphs shows clearly that at each
of the four puff intensities, the High A group 
(shown by solid curves) was well above the 
Low A group (broken curves). Statistical analy- 
sis over all of the conditioning trials revealed the 
differences were significant at the .01 level in the 
lower graph and at the .025 level in the upper.¹
function of A-score and intensity of US.


A second point to be noted in these data is 
relevant to the assumption that the learning or 
habit factor (H) and the drive factor (D) com- 
bine in a multiplicative manner to determine re- 
sponse strength. This assumption leads to the 
further implication that frequency curves of con- 
ditioning for different values of the anxiety vari- 
able will exhibit a gradual divergence over the 
course of training.² That this prediction was


1 Since unequal numbers of each sex were used in 
both of these studies and because women consistently 
exhibit a greater difference than men, the curves have 
been weighted equally for male and female subjects. 

2 This prediction must be qualified to the extent 
borne out may be seen by inspecting the graphs.
Statistical confirmation of the divergence is re- 
vealed by the fact that the trials X anxiety inter- 
action terms for both sets of data were highly 
significant (.005 and .025 levels). 
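
The divergence prediction follows directly from the multiplicative rule and can be sketched numerically. The growth function for H, its rate parameter, and the two drive values below are hypothetical choices made only for illustration; the text specifies nothing beyond E = H × D and the assumption that High A subjects have the higher D.

```python
def habit_strength(n_trials, rate=0.03):
    """Assumed negatively accelerated growth of H toward 1.0 with N."""
    return 1.0 - (1.0 - rate) ** n_trials


def excitatory_potential(n_trials, drive):
    """E = H x D, the multiplicative assumption stated in the text."""
    return habit_strength(n_trials) * drive


# Hypothetical drive levels for the High A and Low A groups.
HIGH_D, LOW_D = 1.5, 1.0


def e_difference(n_trials):
    """Gap in excitatory potential between the two groups after N trials."""
    return (excitatory_potential(n_trials, HIGH_D)
            - excitatory_potential(n_trials, LOW_D))


for n in (0, 20, 40, 80):
    print(n, round(e_difference(n), 3))
```

Because the gap equals H × (HIGH_D − LOW_D), it starts at zero and widens as H grows with training. Under an additive rule, E = H + D, the gap would instead be the same at every trial, so diverging curves are what distinguishes the multiplicative assumption.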

The findings with respect to the intensity of 
US variable also supported the implications of 
our theory. Thus it may be seen in both studies 
that the subjects that had the strong puff per- 
formed at a higher level than those with the weak 
puff. The divergence between the curves is also 
apparent. 

As an indication of the stability of our findings
involving these two experimental variables, Fig. 3 
presents data from these same two studies along 
with some relevant data from four other investi- 
gations recently conducted in our laboratory 
(26). Shown on the ordinate of this graph are 
the percentage of CR’s given in the block of 
Trials 41-80 as a function of the intensity of the 
unconditioned stimulus employed. The upper- 
most curve in this graph represents the results 
for subjects selected from the high end of the 
A-scale; the lowest curve, subjects selected from 
the low end. The middle curve represents data 
obtained in four different experiments in which 
unselected subjects, so far as A-score, were con- 
ditioned under highly comparable conditions to 
those used with the selected subjects (i.e., similar 
visual CS and very comparable S_c-S_u intervals).
The consistency of the results from experiment to 
experiment, particularly the relation of the curve 
for the unselected subjects to those for the High 
and Low A subjects, is, I believe, quite impressive. 

In addition to the data presented in these 
graphs, four other investigations have reported 
finding that High A subjects responded at a sig- 
nificantly higher level than Low A subjects in 
eyelid conditioning (21, 22, 28, 30). One addi- 
tional study (8) also found superior performance 
by High A subjects, although the difference in 
this instance was not significant. A reasonable 
interpretation of the failure to obtain a signifi- 
cant difference in this latter study, especially to 
anyone familiar with the variability of individual 
conditioning data, is that there were only ten 
subjects in each group. 


that the frequency measure has a ceiling of 100% 
and thus may not always reflect the continued growth 
of E. This is particularly the case at high levels of 
D in which E also is high. 
Mention was made earlier of the fact that in
addition to the anxiety scale we have also at- 
tempted to employ a number of physiological 
measures as further operational definitions of our 
emotional responsiveness variable. One of the
most discouraging aspects of this work has been 
the lack of consistency, i.e., unreliability from 
day to day, of such measures. Especially has this 
been the case with the GSR, on which, unfortu- 
nately, we concentrated most of our time and 


energy. Recently, however, we have obtained 
results (to be published) with these measures
that are rather more promising. Using changes 
in GSR and heart rate made to a mildly noxious 
stimulus and converting the measures into a so- 
called autonomic lability score by means of a 
formula suggested by Lacey (11), two groups of 
subjects who fell in the upper and lower third 
of the distribution of such scores were subse- 
quently conditioned. Shown in Fig. 4 are the 
frequency curves of eyelid conditioning for these 
two groups of subjects. As may be seen, the 
subjects with the high autonomic lability index 
performed at a higher level than those with a 
low index. The difference is significant at the 
.02 level. 

In addition to varying performance by manipu- 
lating the A-scale and US variables, it should be 
possible to produce a higher level of condition- 
ing performance by presenting a strong extra 
stimulus, such as an electric shock, during the 
course of conditioning. Similarly, after a subject
has experienced a strong electric shock just
prior to conditioning, the mere threat of further 
shocks during the conditioning should arouse a 
strong and persisting emotional response that 
would raise the level of D and hence the level of 
performance. We have already published the re- 
sults of one such experiment with unselected sub- 
jects which corroborated, in part, these theoreti- 
cal expectations (25). 

Recently a further experiment (to be pub- 
lished) studying the effects of shock threat on 
High and Low A subjects was completed in our 
laboratory. Some idea of the nature of the 
findings can be gained from Fig. 5 which presents 
the frequency curves of conditioning for four 
groups of Low A subjects (20th percentile). The 
two top curves in this graph are for subjects con- 
ditioned with a relatively strong air puff (1.5 
lbs./sq. in.); the lower two, for subjects who
had a weak puff (.25 lb./sq. in.). It will be seen
that at both puff levels the threatened group 
(solid curve) was consistently above the non- 
threatened group (broken curve) throughout the 
whole 80 trials. A similar experiment with high 
anxious subjects revealed a difference between the 
threat and nonthreat groups throughout the con- 
ditioning in the case of groups which had a weak 
puff. In the strong puff groups, however, the 
curves for the threat and nonthreat group, after
separating in the early trials, came together in the 
later stages of the conditioning (last 40 trials). 
This latter effect undoubtedly results, in part, 
from the ceiling imposed by the frequency 
measure. 
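
The ceiling effect can be sketched in a few lines. Everything numerical here is hypothetical: the linear, capped mapping from E to percentage of CR’s, the gain, the late-training value of H, and the treatment of shock threat as an additive increment to D are assumptions made for illustration, not parts of the theory as stated.

```python
def response_frequency(e, gain=60.0):
    """Assumed mapping from E to percent CRs: linear, capped at 100%."""
    return min(100.0, gain * e)


H_LATE = 1.0        # habit strength late in training (hypothetical)
THREAT_BOOST = 0.4  # hypothetical increment in D produced by shock threat


def threat_effect(base_drive):
    """Difference in percent CRs between threat and nonthreat groups."""
    threatened = response_frequency(H_LATE * (base_drive + THREAT_BOOST))
    unthreatened = response_frequency(H_LATE * base_drive)
    return threatened - unthreatened


print(threat_effect(base_drive=0.8))  # lower-drive groups: a clear gap
print(threat_effect(base_drive=1.8))  # high-drive groups: both at ceiling
```

With the lower base drive the two groups remain separated on the frequency measure, whereas with the higher base drive both reach 100% and the measured difference vanishes even though the underlying values of E still differ.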

Space will not permit a detailed presentation 
of the experimental evidence with respect to that 
portion of our theory concerned with the assump- 
tion that the emotional response to the noxious 
US would be weaker if adaptation trials are 
given prior to conditioning. It is sufficient to 
state that the original finding of MacDonald (12), 
that such preadapted subjects exhibited a lower 
level of performance in conditioning than non- 
adapted subjects, has been corroborated by Tay- 
lor (32). The latter experimenter also found 
that conditioning performance was inversely re- 
lated to the intensity of the US employed during 
the preconditioning adaptation period. Thus the 
implications of this part of the theoretical net- 
work have also received further support. 
and intensity of the US for subjects who score
low on A-scale. 


The final set of conditioning data that I would 
like to present are concerned with the effect of 
level of D on differential conditioning. Without 
going into the theoretical derivation, it may be 
shown that one of the implications of our theory 
is that the higher the drive level, D, of the sub- 
jects, the greater should be the differentiation 
between the positive and the negative, i.e., non- 
reinforced, stimulus in such differential condi- 


tioning. Two studies from our laboratory have 
reported finding that, in five separate comparisons,
high anxious subjects showed better discrimination
than low anxious subjects (21, 23).
Although none of the differences were significant, 
four closely approached being so. More recently 
we have investigated the effect of varying the 
level of D on differential conditioning by direct 
manipulation of the intensity of the US. The 
graph in Fig. 6 presents the findings of this study 
(17) in terms of the frequency of CR’s given to
the two stimuli, positive and negative. As may 
be seen, the subjects conditioned with the strong 
US not only showed a higher level of response 
to the positive and negative stimuli, but, as pre- 
dicted from our theory, the difference between 
the conditioned responses to the two stimuli was 
greater in the case of the group that had the 
strong US. Again this latter difference ap- 
proached, but did not quite reach, statistical sig- 
nificance. Unfortunately, conditioning data are 
plagued by high individual variability, produced 
in part by a few subjects who show very little or 
no conditioning. In an effort to ascertain what 
the finding would be for subjects who showed 
considerable conditioning, a separate analysis 
was made of the upper two-thirds of each of the 
two groups run in the experiment. In the case 
of these subjects discrimination was significantly 
better for the high drive group at the .05 level.

So much for the findings in our eyelid condi- 
tioning studies. On the whole, we believe they 
are in fair accord with our theoretical schema, 
including the portion of it that involves the A- 
scale. While not all of the results have met 
acceptable levels of significance, the fact that the 
direction of the differences in such instances has 
almost invariably been in accord with the theory 
has encouraged us to continue to hold to it. 
Attention might also be called here to the fact 
that this theoretical model, particularly the hypo- 
thetical emotional response mechanism, has also 
been quite successful in connection with a wide 
variety of other behavioral situations involving 
noxious stimulation with animals. Examples are 
to be found in the many studies cited by Miller 
(13) on the motivating and reinforcing roles of 
acquired fear in learning situations. The experi- 
ments of a number of investigators on the 
persisting motivational effects of emotionality 
aroused by electric shock on the consummatory 
behavior of rats provide yet another example (1, 
2, 18, 19). 

Complex Human Learning. —Turning now to 
our studies involving the more complex types of 
human learning, let me begin by saying that it is 
in this area that the limitations of space in ex- 
perimental journals for theoretical elaboration 
have been most unfortunate. Certainly we recog- 
nize that these treatments have been quite inade- 
quate, particularly from the point of view of dis- 
cussing the many factors that complicate efforts 
at theorizing in this area. By way of example 
let me mention two important points that need 
to be recognized but, unfortunately, have not 
always been so. 

First, it should be realized that in order to 
derive implications concerning the effects of drive 
variation in any type of complex learning task, it 
is necessary to have, in addition to the drive 
theory, a further theoretical network concerning 
the variables and their interaction that are in- 
volved in the particular learning activity. It is 
perhaps unnecessary to point out here that theo- 
retical schemas for such types of learning are 
as yet in a very primitive state of development, 
indeed almost nonexistent. As a consequence of 
this, one has considerable difficulty in drawing 
conclusions about the motivational part of the 
new, combined theory from supposedly negative 
findings, for the defect may be in the part of the 
network specifying the action of the variables in 
the complex learning situation. 

The second point is that our theory of the 
mechanism underlying D was developed in con- 
nection with experimental situations involving 
some form of noxious stimulation. Complex hu- 
man learning tasks, on the other hand, typically 
do not involve the use of a noxious stimulus. 
Whatever stress is present in these situations is 
usually produced by instructions that aim to 
create in the subject the desire or need to make 
as good a showing as possible. While it is true 
that this stress may be greatly augmented by 
introducing failure or punishment into the situa- 
tion, so far as the usual type of human learning 
experiment is concerned, the question as to 
whether High A subjects would be more emo- 
tional than Low A subjects, and hence have a 
higher D level, is a moot one. In this connec- 
tion two alternative subhypotheses have been pro- 
posed: (a) the chronic hypothesis: that High A 
subjects react emotionally in a chronic manner 
to all situations, whether stressful or not; and 
(b) the emotional reactivity hypothesis: that 
High A subjects have a lower threshold of emo- 
tional responsiveness and react with a stronger 
emotional response than Low A subjects to situa- 
tions containing some degree of stress (16, 20, 
25). As may be seen, according to the first of 
these hypotheses, mild nonthreatening situations 
would produce a differential drive (D) level in 
subjects scoring at extremes of the scale; whereas 
according to the second, there would not be a 
difference. These two examples are sufficient, 
I believe, to point up the fact that the problems 
involved in the extension of the theory to these 
more complex types of learning are quite formi- 
dable and that at this stage there necessarily 
must be a considerable amount of trial and error 
in our theorizing. 

Now it will be recalled that the theoretical 
schema presented in Fig. 1 assumed that in clas- 
sical conditioning habit strength to but a single 
response was established to the CS. In this cir- 
cumstance, as we have seen, an increase in drive 
level implied an increase in response strength. 
In more complex, selective learning tasks, on the 
other hand, there is, typically, a hierarchy of 
competing response tendencies. Actually most 
of the complex learning situations employed with 
humans involve a number or sequence of such 
response hierarchies which involve competing re- 
sponses, e.g., a number of choice points in the 
maze, whether verbal or spatial. To show what 
the implications of variation of drive level will 
be in such competing response situations, let us 
begin by considering the simplest conceivable 
case: one in which there is but a single response 
hierarchy involving two alternative responses. 
The single choice point maze involving turning 
left or right is one example of such a situation. 
If now the habit strength of the correct to-be- 
learned response is, at the beginning of the learn- 
ing, somewhat stronger than that of the incorrect 
response, it may be shown that the higher the 
drive level, D, the greater will the difference 
between the competing excitatory potentials be, 
and, neglecting all other considerations, the higher 
should be the percentage of correct responses at 
the start of learning, the sooner should the learn- 
ing criterion be attained, and the smaller should 
be the total number of errors.³ 

³ As discussed in my Silliman Lectures (21), there 
are a number of other considerations that need to be 
taken into account in extending the theory to such 
competing response situations. Thus the particular 
composition rule (law) assumed in these lectures to 
describe the manner in which the competing responses 
interacted with each other led to the implication that 
the percentage of occurrence of the competing re- 
sponses is a function, not only of the magnitude of 
the difference between the competing E's, but also of 
their absolute level above the threshold L. As a con- 
sequence, in the low range of E values, there may 
actually be an inverse relation between performance 
level (per cent choice of the response with stronger 
E) and the level of drive. Still other considerations 
involve whether habit strength (H) in learning situ- 
ations is or is not assumed to be dependent on the 
reinforcer and whether drive strength (D) determines 
the inhibitory factor (I_n). Different combinations of 
these alternative assumptions, including even other 
possible composition rules, lead to different behavior 
consequences. Critical evaluation of the different con- 
ceivable theoretical models will require considerably 
more empirical data obtained under a wide variety of 
experimental conditions than is now available. 

The reverse situation, that in which the correct 
response is at the outset weaker than the incor- 
rect one, is, from the theoretical viewpoint, even 
more complex. In this instance the stronger the 
drive, the greater will be the per cent choice of 
the wrong response, or, in other words, the poorer 
will be the performance at this initial stage. But, 
as training proceeds, sooner or later the habit 
strength of the correct, reinforced response will 
overtake that of the wrong, unreinforced response 
and from this point on the per cent choice of the 
correct response should in general be higher for 
the high drive group than for the low drive 
group. In other words, the performance curves 
should be expected to cross. Precise predictions 
about the total number of errors, number of trials, 
etc., in this situation will depend to a considerable 
extent upon the particular functions and param- 
eter values assigned to the assumed habit and 
inhibitory factors. Actually we have never got 
around to working out in detail the implications 
of the various possibilities for the total learning 
period even in this simplest case. 
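
The two single-hierarchy cases just described can be illustrated with a small numerical sketch. It assumes the multiplicative relation E = H x D for each of the two competing responses and, purely for illustration, a logistic rule that converts the difference between the two excitatory potentials into the probability of choosing the correct response. The growth function for habit strength, the parameter values, and all function names are hypothetical; they are not taken from the theory as published, and threshold and inhibitory factors are ignored.

```python
import math

def habit_correct(trial, h_start, rate=0.15):
    """Habit strength of the reinforced response after `trial` trials.
    Negatively accelerated growth toward 1.0 (an illustrative assumption)."""
    return 1.0 - (1.0 - h_start) * math.exp(-rate * trial)

def p_correct(trial, drive, h_start, h_wrong, slope=4.0):
    """Probability of choosing the correct response on a given trial.
    E = H * D for each response; the logistic choice rule is assumed."""
    e_correct = drive * habit_correct(trial, h_start)
    e_wrong = drive * h_wrong  # unreinforced habit held constant for simplicity
    return 1.0 / (1.0 + math.exp(-slope * (e_correct - e_wrong)))

def curves(h_start, h_wrong, trials=30, low_d=1.0, high_d=2.0):
    """Performance curves for a low drive and a high drive group."""
    low = [p_correct(t, low_d, h_start, h_wrong) for t in range(trials)]
    high = [p_correct(t, high_d, h_start, h_wrong) for t in range(trials)]
    return low, high

# Case 1: correct response initially the stronger one.
low_a, high_a = curves(h_start=0.3, h_wrong=0.2)

# Case 2: correct response initially the weaker one.
low_b, high_b = curves(h_start=0.2, h_wrong=0.5)

# First trial on which the high drive group is ahead in case 2.
crossover = next(t for t in range(len(low_b)) if high_b[t] > low_b[t])
print("curves cross at trial", crossover)
```

Under these assumptions the high drive curve lies above the low drive curve throughout in the first case, whereas in the second case it starts below and moves above once the habit strength of the correct response overtakes that of the wrong one. A different composition rule, or the inclusion of threshold and inhibitory factors, could change the picture considerably.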

Recalling now that such a learning task as the 
serial verbal or spatial maze involves a number 
of such competing response hierarchies, we see 
that the problem of predicting the effect on per- 
formance of variation of the drive in such situa- 
tions becomes even more complicated. On the 
assumption that anticipatory and perseverative 
associative tendencies would develop in such a 
manner as to make the incorrect response the 
stronger in the case of many of the choice points 
of a maze, it was hoped that it would be possible 
to demonstrate that high drive (i.e., High A) 
subjects would perform more poorly in such 
serial learning situations than low drive (i.e., 
Low A) subjects. Two experiments, one with 
a verbal form of maze (35) and one using a 
finger maze (5) actually did provide results in 
agreement with this theoretical expectation. 
However, as was pointed out at the time, there 
was a serious discrepancy between the theory and 
the obtained results in these studies in that the 
anxious subjects made more errors at all but one 
of the choice points in both studies. In view of 
the ease of learning many of these choice points, 
and hence evidence for little or no strong com- 
peting response tendencies, the theory would have 
led us to expect that the High A subjects would 
have made fewer errors on them than the Low A 
subjects. Obviously the theory was wrong in 
some respect, but just in what way—an incorrect 
assumption or failure to include an important 
relevant variable—was not clear. 

At this point in our work we realized that such 
serial learning tasks are, for a variety of reasons, 
quite unsatisfactory. Among the most important 
from our viewpoint was the fact that one has 
little or no knowledge of the relative strengths of 
the competing responses in each of the hier- 
archies. Accordingly we abandoned this type 
of situation and attempted to develop learning 
tasks in which it would be possible to specify or 
manipulate in some known manner the relative 
strengths of the competing responses in each hier- 
archy. Probably the chief value of these earlier 
experiments is that they did point up the fact 
that a higher anxiety score (and hence possibly 
a higher drive level) does not necessarily always 
lead to a higher level of performance. 

Among the types of learning problems that we 
turned to was paired-associates learning. This 
type of learning task may be conceived as con- 
sisting of the formation of a set of more or less 
isolated S-R associations or habit tendencies. In 
one type of list, which we have referred to as a 
noncompetitive list, an attempt is made to isolate 
as much as possible the paired items by mini- 
mizing the degree of synonymity or formal simi- 
larity among both the stimulus and response 
words. As learning proceeds and the habit 
strengths of the stimulus words to their paired 
response words increase, high drive subjects 
should, according to our theory, perform at a 
higher level than low drive subjects. An impor- 
tant condition in this derivation is that the asso- 
ciative connections between each stimulus word 
and the nonpaired response words are lower than 
that to the paired response word. 

Two lists of this type have been employed. In 
one the associative connections between the 
paired words were initially zero or at least very 
low. In this type of list it would be predicted 
that there would be little or no difference between 
high and low drive subjects at the start of learn- 
ing, but that as learning progressed the curves of 
correct responses would diverge, that for the high 
drive group being the higher. Using nonsense 
syllables of low association value and low intralist 
similarity, Taylor has reported two experiments 
in which this type of list was employed (33, 34). 
The lower pair of curves in Fig. 7 present the 
data from one of these studies (34). Both 
curves, it will be observed, began at a very low 
level with the curve for the High A group (solid 
line) rising above that for the Low A group 
(broken line). An unpublished study from our 
laboratory employing nonassociated paired adjec- 
tives has given similar results, although the su- 
periority of the High A over the Low A subjects 
was significant only on a single tailed hypothesis. 

The second type of noncompetitive list differs 
from the first in that the associative strengths of 
the paired words are, as the result of past ex- 
periences, considerably above zero. Under this 
condition it would be predicted that the perform- 
ance curves would, on the first anticipation trial, 
be considerably above 0% and that the curve for 
the high drive group would be above that for the 
low drive group. Employing paired adjectives 
that had been scaled by Haagen (7) as having 
high “closeness of association” values, two studies 
(24, 29) have reported results which support 
these implications. The upper pair of curves in 
Fig. 7 shows the findings of one of these studies 
(29). As may be seen, the initial level of per- 
formance was well above 0 and the High A sub- 
jects started out and continued at a higher level 
than the Low A subjects. On the other hand, 
a recently completed doctoral dissertation (6) 
using this type of list failed to obtain results in 
accord with the theory. There was little or no 
difference between the two groups at any stage 
of practice. 

In contrast to these noncompetitive type lists 
we have also designed a competitive list which 
includes some paired items in which the initial 
habit strength of the stimulus word to call out the 
paired word is weaker than the habit strengths 
to one or more other response words in the list. 
In the case of these items it would be predicted 
from our theory that high drive subjects would 
at the start of learning perform more poorly than 
low drive subjects. Here again we should have 
emphasized that the theory of paired-associates 
learning has as yet not been developed sufficiently 
to predict what will happen beyond the first few 
trials, and it would have been more appropriate, 
as far as implications for our drive theory are 
concerned, if we had used at most only the data 
from the first four or five trials. Precise predic- 
tions concerning performance beyond this point 
must await the development of a more adequate 
theory of the variables determining the weaken- 
ing of these stronger, incorrect responses in 
paired-associates learning. Two published studies 
(24, 29) and one doctoral dissertation (6) have 
reported data with respect to the implication of 
our theory for this type of list; while all three 
found, as predicted, that the High A subjects 
were inferior to Low A subjects in the first four 
trials, none of the results was statistically signifi- 
cant. However, the implication of the theory 
that there would be an interaction between level 
of A-score and performance on the two kinds 
of lists, competitive and noncompetitive, was con- 
firmed. 

Summarizing the results with these paired-asso- 
ciates lists, I would say that the batting average 
of our theory is fairly high but by no means per- 
fect. It is clearly evident from the data that 
level of A-score (and hence level 
of D), if it is a factor determining performance 
on such tasks, is a relatively unimportant one. 
Certainly individual differences in verbal learning 
ability play a much more decisive role. More- 
over, there are many factors that play im- 
portant roles in such complex behavior situations, 
about which we have as yet little or no knowl- 
edge. Among those of a motivational nature is 
the type of task-irrelevant response that Child and 
his group have studied (3). We think of these 
interfering responses as being elicited by drive 
stimuli (s_D), and hence they would be incorpo- 
rated in a more complete motivational theory of 
learned behavior. On the basis of evidence in 
the literature and some recently completed studies 
of our own, we believe this factor is especially 
important when shock is introduced into verbal 
learning situations. 


I should like to conclude this presentation by 
stating very briefly the purpose of such theo- 
retical schemas as have been presented here. As 
I conceive them, their primary function is to 
provide for the unification of what, without the 
theory, would be a multiplicity of isolated or 
unconnected facts and laws. Thus, in the present 
instance, such phenotypically different phenom- 
ena as behavior in eyelid conditioning under vari- 
ous stimulus conditions, degree of emotionality 
as revealed by a personality questionnaire and 
physiological measures, and such opposite per- 
formance differentials in paired-associates tasks 
as just described have been interrelated by means 
of the theory. That much work, both of a 
theoretical and experimental nature, remains to 
be done in this area of behavior study is clearly 
revealed by the many gaps and deficiencies in 
the present attempt. It is my firm belief, how- 
ever, that progress in the development of this, 
as in any other scientific field of knowledge, is 
greatly facilitated by such theoretically oriented 
research endeavors. 


CONDITIONING AND PERSONALITY: 
A STUDY OF NORMAL AND 
NEUROTIC SUBJECTS * 


CYRIL M. FRANKS ¹ 


Pavlov’s theory of cortical functioning (24, 25, 
26) emphasizes two basic cortical processes, 
excitation and inhibition. The term inhibition is 
unfortunately used in at least three different 
senses. There is the general psychiatric usage 
of the word, as applied to the introverted, with- 
drawn personality; there is the neurophysiological 
use of the word; there is the Pavlovian usage. It 
must be stressed that cortical inhibition in the 
Pavlovian sense (which is how it is used in the 
present paper) should be associated with the 
absence of behavioral inhibition in the psychiatric 
sense. Pavlov (24, 25, 26) has produced ample 
evidence to suggest that both excitation and inhi- 
bition are positive and molar cortical processes. 
However, any further assumptions about the 
physiological nature of these two processes would 
be largely speculative; it must be sufficient for the 
present to regard them both as hypothetical con- 
structs (23) and not merely as intervening 
variables. 

During his experiments, Pavlov observed large 
individual differences in the behavior of his dogs. 
Certain dogs (which he called the excitatory 
kind) developed stable positive conditioned re- 
sponses with ease and retained these responses for 
a long time during extinction. Other dogs (which 
he called the inhibitory kind) developed positive 
conditioned responses very poorly, which, once 
formed, were easily disrupted and soon extin- 
guished. After a discussion of experimentally 
produced “neuroses” in animals and in man, 
Pavlov wrote as follows: 

It has been seen that the above-mentioned 
method may lead to different forms of disturb- 
ance, depending on the type of nervous system 
of the animal. In dogs with the more resistant 
nervous system it leads to a predominance of ex- 
citation; in dogs with the less resistant nervous 
system, to a predominance of inhibition. So far 
as can be judged on the basis of casual observa- 
tion I believe that these two variations in the 
pathological disturbance of the cortical activity 
in animals are comparable to the two forms of 
neurosis in man—in the pre-Freudian terminol- 
ogy neurasthenia and hysteria—the first with ex- 
aggeration of the excitatory and weakness of the 
inhibitory process, the second with a predomi- 
nance of the inhibitory and weakness of the 
excitatory process (24, p. 397f.). 


* Reprinted by permission from The Journal of 
Abnormal and Social Psychology, March, 1956, Vol. 
52, No. 2, 143-150. 

¹ This research has been aided by a grant from 
the Central Research Fund of the University of 
London. 

Pavlov never followed up this observation, and 
it has been neglected ever since. From this ob- 
servation, however, it is possible to predict that 
neurotics of the neurasthenic type should form 
conditioned reflexes rapidly and that these re- 
flexes should be slow to extinguish, whereas neu- 
rotics of the hysteric type should form condi- 
tioned reflexes slowly and that these reflexes 
should extinguish readily. Since the term neuras- 
thenic is ill defined and rarely used in contem- 
porary psychology, it would seem better to exam- 
ine both neurasthenia and hysteria more care- 
fully in an attempt to discover how they differ 
rather than to test the above prediction immedi- 
ately. Both neurasthenia and hysteria are gen- 
erally regarded as forms of neuroses, so it is 
along a dimension other than neuroticism that 
differences must be sought. Pavlov himself has 
implicitly suggested the dimension of excitation— 
inhibition, the neurasthenic being at the excitation 
end of the continuum and the hysteric at the in- 
hibition end. 

In seeking another dimension, Eysenck pointed 
out (1, p. 52f.) that Jung (21, p. 421) suggested 
that the characteristic neurosis of the extravert 
is hysteria, whereas that of the introvert is psy- 
chasthenia. Jung also emphasized the independ- 
ence of introversion and neuroticism, and this 
independence has been experimentally confirmed 
by Eysenck and his co-workers (1, 2, 3). They 
have demonstrated the existence of two orthog- 
onal factors, introversion-extraversion and neu- 
roticism, and have shown that anxiety neurotics, 
obsessive-compulsives, and depressives typically 
have high scores on tests of neuroticism and high 
scores on tests of introversion (with correspond- 
ingly low scores on tests of extraversion). To 
these introverted neurotics Eysenck has given the 
name of dysthymics. He has also shown that 
hysterics and psychopaths have high scores on 
tests of neuroticism and high scores on tests of 
extraversion (with correspondingly low scores on 
tests of introversion). 

The neurasthenic type of neurotic would seem 
to be subsumed under the heading of dysthymia 
and the hysteric under the heading of hysteria 
(used here in Eysenck’s sense to include both 
hysteria and psychopathy). Furthermore, an ex- 
amination of patients usually included in these 
categories suggests that dysthymia is related to 
Pavlov’s concept of excitation and hysteria to 
inhibition. Thus, the dysthymic usually presents 
overt physical and mental characteristics of anx- 
iety, obsessions, compulsions, or ruminations. He 
may be irritable, introspective, ill at ease with 
people, very much aware of himself and of others. 
He is often overconscientious and highly sensi- 
tive to his environment. He may be agitated, 
hyperactive, and tense, or he may be reflective 
in his thought, overcautious and hesitant. All 
these characteristics are consistent with a pre- 
sumed state of cortical excitation. 

The hysteric group presents very different char- 
acteristics. They may develop “escape mecha- 
nisms” such as fugues, amnesia, or gross conver- 
sion symptoms. They tend to be insensitive, 
irresponsible, unreliable, and little concerned 
about other people. Most of these characteristics 
seem to involve some form of dissociation and 
may hence be reasonably conceived as associated 
with a state of cortical inhibition. 

Pavlov’s original observation may, therefore, 
be slightly revised in the form of the following 
hypothesis: 

Neurotics of the dysthymic type form condi- 
tioned reflexes rapidly, and these reflexes are 
difficult to extinguish; neurotics of the hysteric 
type form conditioned reflexes slowly, and these 
reflexes are easy to extinguish. 


PREVIOUS STUDIES 


There have been many experimental studies on 
the conditionability of neurotics (e.g., 5, 31). 
Most workers have found that “neurotics” as a 
group condition better than normals. They have 
usually made no attempt, however, to classify the 
neurotics into dysthymics and hysterics. Since 
more patients are of the dysthymic variety than 
the hysteric, these findings are probably more 
accurately interpreted as indicating that dys- 
thymic neurotics condition better than normals. 
More recently, Taylor and Spence (35) compared 
the conditionability of anxious neurotics with 
“other neurotics” and failed to find any statisti- 
cally significant differences. It is significant, how- 
ever, that the diagnoses represented in the general 
category of “other neurotics” included reactive 
depression, alcoholism, psychopathy, character 
disorders, obsessive compulsions, compulsions, 
etc. In other words, this classification of “other 
neurotics” included large numbers of both the 
dysthymic and hysteric types of neurotics. Thus, 
the relative conditionability of dysthymic as com- 
pared with hysteric neurotics remains an open 
question. 

Many studies, however, have been made of the 
conditionability of one kind of dysthymia, the 
anxiety states. Using the PGR, for example, it 
has been shown that anxious subjects condition 
better than normals (28, 36, 37), and similar 
results have been obtained with the eyeblink re- 
flex (18, 29, 30, 32). 

Thus, since dysthymics and hysterics appear 
to differ along the dimension of introversion- 
extraversion, and since it seems reasonable to 
associate dysthymia with excitation and hysteria 
with inhibition, it is probable that excitation is 
associated with introversion and inhibition with 
extraversion. If this argument is sound, then 
conditionability is more a function of introver- 
sion-extraversion than of neuroticism. None of 
the studies cited bears on this hypothesis since 
they were concerned with the conditionability of 
only one group, the anxiety states. Such Ss 
would form conditioned responses more readily 
than nonanxious Ss either because they were more 
neurotic or because they were more introverted. 
A crucial experiment would have to examine the 
conditionability of both dysthymics and hysterics, 
and also of normals. If the ease of condition- 
ability of the dysthymics is related to their neu- 
roticism, then hysterics and dysthymics should 
condition equally well, and the normals should 
condition less well than either neurotic group. 
If, however, the ease of conditionability of the 
dysthymics is related to their degree of introver- 
sion (i.e., to their degree of excitation), then it 
follows that the hysteric group would condition 
the least, the dysthymic group the most, and the 
normals in between. Furthermore, if condition- 
ability is related to introversion-extraversion, then 
within the normal group there should be no cor- 
relation between conditionability and a test of 
neuroticism, but a significant negative correlation 
between conditionability and a test of extraver- 
sion. 
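
The logic of the two competing predictions for the group comparison can be set out as a short sketch. The function name, the tolerance parameter, and the sample scores are hypothetical and serve only to make the argument concrete; they are not data from the experiment.

```python
def consistent_hypothesis(dysthymic, normal, hysteric, tol=0.0):
    """Classify a pattern of group mean conditioning scores.

    'neuroticism'  : both neurotic groups condition equally well (within
                     `tol`) and better than the normals.
    'introversion' : dysthymics condition most, hysterics least, and
                     normals fall in between.
    'neither'      : any other pattern.
    """
    neurotics_equal = abs(dysthymic - hysteric) <= tol
    if neurotics_equal and min(dysthymic, hysteric) > normal:
        return "neuroticism"
    if dysthymic > normal > hysteric:
        return "introversion"
    return "neither"

# Made-up mean numbers of conditioned responses, for illustration only.
print(consistent_hypothesis(dysthymic=10.0, normal=6.0, hysteric=10.0))
print(consistent_hypothesis(dysthymic=10.0, normal=6.0, hysteric=3.0))
```

The first call illustrates the ordering expected if conditionability depends on neuroticism, the second the ordering expected if it depends on introversion-extraversion. In practice the comparison would of course rest on tests of significance rather than on a bare ordering of means.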


THE PRESENT EXPERIMENT 

Twenty dysthymic patients, 20 hysteric pa- 
tients, and 20 normals were tested in a specially 
constructed soundproof conditioning laboratory 
(7). The unconditioned stimulus (US) was a 
puff of air, the conditioned stimulus (CS) a tone 
administered through a pair of headphones, and 
the unconditioned (UR) and conditioned (CR) 
responses were both eyeblinks and PGR changes. 
Partial conditioning was used, each S being given 
30 reinforcements, interspersed with 18 test trials, 
and 10 extinction trials. All conditioning was 
carried out in one session, taking approximately 
half an hour per S. The 20 normal Ss were re- 
tested after an interval of 14-21 days. All Ss 
were given personality questionnaires defining 
neuroticism and introversion-extraversion, to- 
gether with Taylor’s Manifest Anxiety Scale (33) 
and the Maudsley Medical Questionnaire 
(MMQ). 

Subjects —All Ss were aged between 17 and 47 
years. Although the normal group was signifi- 
cantly younger than the two neurotic groups, the 
neurotic groups did not differ significantly in age. 
The groups were not matched for sex since there 
is some evidence (22) to suggest that there is no 
relationship between sex and eyeblink condition- 
ability, and there are no reasons for suspecting 
that sex differences exist with respect to PGR 
conditioning. According to Hilgard and Marquis 
(19, p. 303), there is little correlation between 
intelligence and conditionability; consequently, no 
attempt was made to match the three groups on 
this variable other than to exclude any Ss of low 
intelligence, the criterion being a raw score of 
less than 29² on the Raven Matrices nonverbal 
test of intelligence (27). All three groups happen 
to be significantly different from each other in in- 
telligence. The data relating to age, sex, intelli- 
gence, and extraversion are presented in Table 1. 


² In terms of the Wechsler-Bellevue scale, this 
would mean very approximately that Ss with an IQ 
below 90 would be excluded. However, concern here 
was not with an S’s IQ, but with his absolute amount 
of intelligence, his raw score on the test concerned 
irrespective of age, sex, etc. 

The 20 normal subjects. The normal Ss con- 
sisted of Occupational Therapy students, student 
male nurses, and Occupation Centre student 
teachers. All were naive with respect to psychol- 
ogy, and all were volunteers. They were re- 
quested to avoid discussing the experiment with 
friends. Twenty-three Ss were originally tested, 
of whom one was rejected because she blinked 
excessively; one was rejected because he did not 
know sufficient English to answer the personality 
questionnaires, and one was rejected because he 
had recently been discharged from a mental hos- 
pital. 
The 40 neurotic subjects (20 dysthymics, 20 
hysterics). All the neurotic Ss were residential 
or outpatients at the Maudsley Hospital. No pa- 
tient was used who had any evidence or history of 
psychotic features, brain injury, or epilepsy, or 
who had received any form of psychosurgery. 


TABLE 1 

DATA ON AGE, SEX, INTELLIGENCE, AND 
EXTRAVERSION SCORES 

N = 20 in each group 

             |    Age in    | Sex (No. of |   Matrices   |
             |    Years     |  Subjects)  |  Raw Score   |   R Score
Group        | Mean |  SD   |   M  |   F  | Mean |  SD   | Mean |  SD
Hysterics    | 28.8 |  7.1  |   7  |  13  | 44.6 |  9.0  | 49.6 | 11.2
Normals      | 24.3 |  4.9  |   6  |  14  | 57.0 |  3.0  | 40.5 | 11.6
Dysthymics   | 31.3 |  8.2  |  15  |   5  | 50.2 |  6.7  | 20.6 |  6.3


No S was used who had any physical treatment 
such as ECT or insulin during the previous eight 
weeks or who had been recently receiving such 
medication as sodium amytal. (All Ss were de- 
nied any drugs for the preceding 24 hours.) 

The Ss were included in the hysteric group if 
the psychiatrist concerned could diagnose them 
as having one or more of the following charac- 
teristics: hysterical personality, conversion symp- 
toms, hysteria, psychopathic personality, psycho- 
pathic features. Furthermore, the psychiatrist 
was requested to select those Ss with a minimum 
of obsessive-compulsive features, manifest anx- 
iety, or depression. 

The Ss were included in the dysthymic group if 
they could be diagnosed as having one or more of 
the following characteristics: manifest anxiety, re- 
active depression, obsessive-compulsive features. 
They also had to show a minimum of hysteric and 
psychopathic characteristics. 

One final criterion was used in selecting the two 
neurotic groups. All hysteric patients with a score 
of less than 33 on Guilford’s R Scale (13) were 
rejected, and all dysthymic patients having an R 
score of above 32 were also rejected. This cri- 
terion was included since it has been shown (3, 
17) that Guilford’s R Scale is one of the best 
available measures of extraversion. Thus, the 
two neurotic groups were deliberately chosen to 
minimize overlap along the dimension of intro- 
version-extraversion. By this means, any differ- 
ences in conditionability between introverted and 
extraverted neurotics should become more readily 
apparent with a relatively small number of cases. 
The means and SD's for the three neurotic and 
normal groups are presented in Table 1. Al- 
though all three groups are significantly different 
from each other with respect to their R scores, 
the normal group tends to resemble the hysterics 
more than they do the dysthymics in this respect. 
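
The R-score screen described above can be sketched in a few lines of Python. This is an illustration only and not part of the original procedure: the function name `passes_r_screen` and the group labels are assumptions introduced here, while the cutoffs (hysterics below 33 rejected, dysthymics above 32 rejected) are the ones stated in the text.

```python
def passes_r_screen(group: str, r_score: int) -> bool:
    """Return True if a patient's Guilford R score satisfies the selection rule.

    Hypothetical helper; the labels "hysteric" and "dysthymic" are
    illustrative names, not identifiers from the original study.
    """
    if group == "hysteric":
        return r_score >= 33   # hysterics scoring below 33 were rejected
    if group == "dysthymic":
        return r_score <= 32   # dysthymics scoring above 32 were rejected
    raise ValueError(f"unknown group: {group!r}")
```

Because the two cutoffs meet at 32/33, no R score can qualify a patient for both groups, which is how the rule minimizes overlap on the introversion-extraversion dimension.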

Twenty-eight hysteric patients were originally
tested, of whom eight had to be rejected because 
seven had R scores below 33, and because of a 
major error by the experimenter during condition- 
ing with one. The final group of twenty hys- 
terics were diagnostically constituted as follows: 
eight, hysteria; two, hysterical features in an im- 
mature personality; two, conversion hysteria; one, 
anorexia nervosa hysteria; one, hysteria plus mixed 
neurotic reactions; one, hysterical palsy; three, 
sexual psychopathy; one, chronic alcoholism in a 
psychopathic personality; one, delinquency in a
psychopathic personality. 

Twenty-seven dysthymic patients were originally 
tested of whom seven had to be rejected for the 
following reasons: four had R scores above 32; 
two had changed diagnoses; one was initially 
sensitive to the sound, giving both a blink and a 
PGR to a sound stimulus. The final group of 
twenty dysthymics were diagnostically constituted 
as follows: eleven, mainly obsessional and com- 
pulsive features; five, mainly anxiety features; 
four, mainly depressive features. In most cases 
the diagnoses were mixed, including more than 
one of these features.

The Personality Questionnaires.—Guilford’s
scales STDCR (13) and the Guilford-Martin
scales G and A (16) were typed on cards, one
item to each card, and the S requested to “mail”
each card into one of three slotted boxes labeled
YES, NO, and ?. The 237 cards were presented 
to the Ss in random order. The seven traits are 
described in the test manuals as follows: 


S—social introversion: shyness, seclusiveness, etc.

T— thinking introversion: an inclination toward 
meditative or reflective thinking, self analysis, 
etc. 

D—depression: habitually gloomy, pessimistic with
guilt feelings, etc. 

C—cycloid disposition: strong emotional fluctua- 
tions, emotional instability, etc. 

R—rhathymia: happy-go-lucky, carefree, lively,
impulsive, etc. 

G—general activity: a general pressure to engage 
in overt activity, etc. 

A—ascendancy: leadership qualities; social as- 
cendancy, etc. 


The Guilfords identified these traits by factor 
analysis studies (14, 15). There is, however, 
some evidence to suggest that C and D are largely 
measures of the factor of neuroticism and that R 
is a good measure of the introversion-extraversion 
dimension, a high R score characterizing the ex- 
travert end of the dimension (3, p. 106ff.; 17). 

Taylor’s Manifest Anxiety Scale and the MMQ 
were also typed on cards and the S requested to 
“mail” each card into one of two boxes labeled 
TRUE and FALSE. Taylor’s scale normally con- 
sists of 50 anxiety items and 175 buffer items, but 
here the original 175 buffer items were omitted 
and in their place were substituted the 48 items 
of the MMQ. These 98 items were administered
in random order. 

Taylor’s Anxiety Scale has been used extensively 
in conditioning studies (e.g., 29, 30, 32) and is 
supposed to discriminate the anxious from the 
nonanxious Ss. The MMQ consists of a 40-item 
neuroticism scale together with 18 “Lie” items. 
It is discussed in detail by Eysenck (2, p. 94f.),
and there is little doubt that, given good subject 
motivation, it discriminates well between neurotics 
and normals. 

Conditioning Sequence, Apparatus, and Method 
of Recording.—Before the conditioning session 
was begun, each S was given three tone stimuli, 
followed by three air puff stimuli (not paired with 
the CS) and then three more tone stimuli. Only 
Ss who did not give PGR or eyeblink responses 
to the last three tone stimuli were included in the 
conditioning study. The purposes of these trials
were to eliminate those Ss who showed any evi-
dence of pseudo-conditioning (11, 12) or original
sensitivity to the tone.

The conditioning session consisted of 30 rein- 
forced trials, randomly interspersed with 18 test 
trials (called acquisition trials). The US con- 
sisted of an air puff, lasting 500 msec., delivered 
at a pressure of 65 mm. of mercury from a 2.5 
mm. internal diameter polythene tube at approxi- 
mately 2 cm. from the right eye. The CS was a 
pure tone, delivered to both ears through a pair 
of padded high quality earphones at a frequency 
of 1,100 cycles per sec. for a duration of 800 
msec. after the tone was presented. The time 
intervals were accurately controlled by means of 
an electronic timer. Finally, each S was given 
ten consecutive test trials (called extinction trials). 

The air was supplied from a compressed air 
cylinder and puffed into the eye by means of an 
electronically operated gas valve. The polythene 
tube ran from the gas valve to a pair of plain 
glass spectacles worn by the S. The right spec- 
tacle lens had a small aperture into which the 
plastic tube was attached. The eyelid movements 
were recorded by means of a photoelectric cell 
attached to the same lens as the polythene tube. 
This photoelectric method of recording the eye- 
lid movements has been described in detail else- 
where (10). It is particularly suitable for work- 
ing with patients since it does not require any 
electrodes or artificial eyelashes to be attached 
directly to S, and S does not have to keep his 
head rigidly still. Briefly, it consists of a small 
photoelectric cell and a linear amplifier, the am- 
plifier EMF being used to drive the pen of a 
recording milliammeter. The milliammeter was 
equipped with several channels so that eyelid 
movements, PGR trace, and the occurrence of the 
CS and the US all appeared on the same record. 

Procedure.—The Ss were told that they were 
to be tested in a quiet room where measures could 
be made of how well they were able to relax under 
various conditions. As detailed elsewhere (6) 
the conditioning laboratory can be partitioned 
into two by means of a sliding curtain. The S 
was seated comfortably in an armchair in one 
half of the laboratory, facing a small table so 
arranged that his field of vision was largely con- 
fined to the plain walls of a booth constructed 
around this table. His head rested against an 
adjustable padded headrest, and his feet were supported
by a footstool. The E and the apparatus
occupied the other half of the room. Once the 
test was ready to begin, the curtains could be 
drawn together, leaving a small gap just sufficient 
for E to see S (but S was unable to see E). 

The spectacles, headphones, and PGR elec- 
trodes were fitted on S; and after a suitable inter- 
val, S’s hearing threshold was measured for the 
frequency of 1,100 cycles per second (both ears 
simultaneously). Any S whose hearing threshold 
was below -20 db was excluded from the study.
In the middle of the booth was a small red light 
(6.3 v., 0.5 amp.) which S was instructed to look 
at whenever it was on. The red light was switched 
on five to ten seconds before any stimulus was 
delivered and switched off a few seconds after- 
ward. It was found that this procedure reduced 
strain in S and ensured that his eyes were open
during the critical period just before any stimulus 
was delivered without having to tell him to keep 
his eyes open. It also reduced eye movements 
and spontaneous blinking during this critical pe- 
riod. The nine sensitization and pseudo-condi- 
tioning stimuli were then given, followed by the 
conditioning and extinction session.

The S was then asked such questions as, “Did 
you find the air puff bearable or unpleasant?” and 
“Did you feel sleepy?” Finally, he was requested 
to sort the personality items. The intelligence 
test had been given to most Ss upon an earlier 
occasion. For the 20 normal Ss, the whole pro- 
cedure, except the intelligence testing, was re- 
peated after an interval of 14-21 days. 

A CR was recorded whenever the record showed 
a deflection of 1.27 mm. (.05 in.) or more during 
a latency of between 156 and 625 msec. after the 
onset of the CS for the eyeblink and a latency of 
between 0.5 and 8.0 seconds after the onset of 
the CS for the PGR. 
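
The scoring rule just stated can be written out as a small Python sketch. The deflection threshold and latency windows come from the text; the helper names are hypothetical, and treating both ends of each window as inclusive is an assumption, since the source says only "between".

```python
# Latency windows after CS onset, in msec, as given in the text.
# Inclusive endpoints are an assumption of this sketch.
LATENCY_WINDOW_MS = {
    "eyeblink": (156, 625),
    "pgr": (500, 8000),       # 0.5 to 8.0 seconds
}
MIN_DEFLECTION_MM = 1.27      # .05 in. on the record


def is_conditioned_response(reflex, deflection_mm, latency_ms):
    """Apply the stated CR criterion to one test trial (hypothetical helper)."""
    low, high = LATENCY_WINDOW_MS[reflex]
    return deflection_mm >= MIN_DEFLECTION_MM and low <= latency_ms <= high


def count_crs(reflex, trials):
    """Count CR's over (deflection_mm, latency_ms) pairs, one pair per test trial."""
    return sum(is_conditioned_response(reflex, d, t) for d, t in trials)
```

Applied to the 18 acquisition test trials or the ten extinction test trials, `count_crs` would give the kind of per-subject score that the results section summarizes.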


RESULTS 


Table 2 gives the mean number of conditioned
responses during the 18 acquisition test trials and
the ten extinction test trials for both PGR and
eyeblink reflexes in all three groups.


TABLE 2

MEAN NUMBER OF CR’s GIVEN BY EACH GROUP

N = 20 per group; maximum acquisition score = 18,
maximum extinction score = 10

             Eyeblink                        PGR
             Acquisition     Extinction      Acquisition     Extinction
Group        Mean    SD      Mean    SD      Mean    SD      Mean    SD
Hysterics     6.7    5.9     2.7     2.7     2.3     2.7     0.8     1.7
Normals       7.9    5.4     3.7     2.7     2.4     3.3     0.6     1.7
Dysthymics   12.4    4.1     6.3     3.2     5.4     4.0     2.1     2.3


From the data presented in Table 2 it may be
concluded for both eyeblink and PGR reflexes
that:

1. Dysthymics give significantly more CR’s
than hysterics both for the acquisition trials and
for the extinction trials (one-tailed t test:
p < .0005 for the eyeblink acquisition and extinction
trials; p < .005 for the PGR acquisition
trials; and p < .025 for the PGR extinction trials).

2. The normals give significantly fewer CR’s than
the dysthymics for both acquisition and extinction
trials (one-tailed t test: p < .0025 for the
eyeblink acquisition trials; p < .01 for the eyeblink
extinction trials; p < .025 for the PGR extinction
trials).

3. When the normals are compared with the
neurotics as a combined group, there are no sig-
nificant differences in the number of CR’s pro-
duced.


Fig. 1. Number of hysterics and dysthymics
misclassified using the frequency of eyeblink
conditioned reflexes as a measure. (Histogram of
the number of acquisition CR’s, 0 to 18; each
symbol represents one dysthymic or one hysteric.)
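
As a rough check on the size of the group differences, a t statistic can be computed from the summary figures in Table 2. The sketch below assumes a pooled-variance, independent-groups t test; the source says only "one-tailed t test", so the exact form is an assumption, and because the table reports rounded means and SD's the result should not be expected to reproduce the published probabilities exactly. The function name `pooled_t` is hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-groups t statistic from summary data (pooled variance).

    Returns the t value and its degrees of freedom.
    """
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, n1 + n2 - 2


# Eyeblink acquisition, dysthymics versus hysterics (values from Table 2).
t_value, df = pooled_t(12.4, 4.1, 20, 6.7, 5.9, 20)
print(f"t = {t_value:.2f} on {df} degrees of freedom")
```

With equal group sizes the pooled and unpooled standard errors coincide, so the choice between them does not affect the t value here.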

The differences are not so marked for the PGR 
as for the eyeblink reflex, probably because of 
the low intensity of the US. It was noticed that 
an air puff at a pressure of only 65 mm. of mercury
was in several cases insufficient to produce
even an unconditioned PGR after the first few
trials. 

Figure 1 shows the histograms for the acqui- 
sition CR’s for the eyeblink reflex. If eight CR’s 
are chosen as the cutoff score, then the classifica- 
tion error is 20 per cent for the hysteric group and 
10 per cent for the dysthymic group, giving a 
total misclassification of only 15 per cent. In 
Fig. 2, the total number of eyeblink CR’s given 
by each of these groups is plotted for each of the 
18 acquisition test trials and the ten extinction test 
trials. 
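
The misclassification figures quoted above follow from simple counts, sketched here in Python. The error counts are derived from the stated percentages (20 per cent and 10 per cent of 20 subjects); the function and variable names are hypothetical.

```python
def misclassification_rates(errors, group_sizes):
    """Return per-group error rates and the overall error rate.

    errors and group_sizes map a group label to a count.
    """
    per_group = {g: errors[g] / group_sizes[g] for g in errors}
    total = sum(errors.values()) / sum(group_sizes[g] for g in errors)
    return per_group, total


group_sizes = {"hysteric": 20, "dysthymic": 20}
errors = {"hysteric": 4, "dysthymic": 2}   # 20 and 10 per cent of 20 subjects

per_group, total = misclassification_rates(errors, group_sizes)
print(per_group, total)
```

Because the two groups are the same size, the overall rate is the simple average of the two group rates.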


Fig. 2. Total number of eyeblink CR’s given by
each group at each test trial. (N = 20 in each
group.)


As indicated in Table 2 and Fig. 2, there are
no statistically significant differences in conditionability
between normals and hysterics. In accordance
with the theory presented in this paper,
it is to be expected that normals would differ with
respect to conditionability from both hysterics and
dysthymics to approximately the same extent that
they differed from these two neurotic groups in
extraversion. In the present study the extraversion
scores of the normal group (see Table 1)
are more like those of the hysteric group than of
the dysthymics. This state of affairs probably occurred
because, although the normal group was
chosen at random, it was, naturally enough, the
extraverted rather than the introverted subjects
who volunteered for the experiment. This situation
possibly accounts for the finding that the
conditioned response behavior of the normals is
somewhat similar to that of the hysterics although
the differences in conditionability between these
two groups are not statistically significant.

Footnote. When similar histograms are plotted for the PGR
data using a cutoff of 3 CR’s, the total misclassification
is 30 per cent. When a double criterion is used,
however, in which to be classified as hysteric the
patient must have 8 or fewer eyeblink CR’s and also
3 or fewer PGR CR’s, then the two groups may be
separated with no misclassification whatsoever.
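
The double criterion described in the footnote (8 or fewer eyeblink CR's together with 3 or fewer PGR CR's) can be expressed as a short decision rule. This is a sketch under stated assumptions: the function name is hypothetical, and labeling every patient who fails the hysteric criterion as dysthymic is an assumption that holds only because the rule is being applied to the two neurotic groups.

```python
def classify_double_criterion(eyeblink_crs, pgr_crs,
                              eyeblink_cutoff=8, pgr_cutoff=3):
    """Classify a neurotic patient from acquisition CR counts.

    A patient is called hysteric only if BOTH counts are at or
    below their cutoffs; otherwise the patient is called dysthymic.
    """
    if eyeblink_crs <= eyeblink_cutoff and pgr_crs <= pgr_cutoff:
        return "hysteric"
    return "dysthymic"
```

Requiring both conditions narrows the hysteric region, so a patient with few eyeblink CR's but many PGR CR's is still classified as dysthymic.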

Similar results were obtained for PGR condi- 
tioning; here also the dysthymics conditioned 
much more strongly than the hysterics and than 
the normals, and the hysterics and the normals 
behaved very much the same. Both the hysterics 
and the normals seemed to adapt to the PGR 
fairly rapidly, and the normals tended to resemble 
the hysterics more than the dysthymics with re- 
spect to their extraversion scores. 

Another measure of conditionability was the 
number of reinforcements necessary to achieve 
a criterion of three successive CR’s. This index 
gave a correlation with the acquisition score of 
.80 for the eyeblink reflex and .78 for the PGR
reflex, which is in agreement with the finding of 
Humphreys (20), who concluded that to report a 
criterion score and an acquisition score is super- 
fluous since the correlation between these two 
scores is as high as the reliability of the condi- 
tioning measures will allow. 

For the normal group, the test-retest reliability 
over a period of 14-21 days was .52 for the eye- 
blink acquisition score, .37 for the eyeblink ex- 
tinction score and .40 for the PGR acquisition 
score. The PGR Pearson product-moment coeffi- 
cient for the extinction data could not meaning- 
fully be calculated because there were too many 
zero values on retest. The PGR correlation is 
low when compared with that of Welch and Ku- 
bis (36, 37), who obtained a test-retest relia- 
bility of .88, but they used electric shock as the 
US. 

When corrected for attenuation (a procedure 
carried out only where reliability data existed, 
i.e., for the normal group), the correlation be- 
tween the PGR and eyeblink acquisition scores 
was .53. For none of the groups were there any 
significant correlations between any measure of 
conditioning and age, sex, or intelligence.
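
The correction for attenuation mentioned above is the standard one, in which an observed correlation is divided by the square root of the product of the two reliabilities. The Python sketch below uses the test-retest reliabilities reported for the normal group (.52 for eyeblink acquisition, .40 for PGR acquisition). The uncorrected correlation is not reported in the text, so the value obtained by working backward from the corrected .53 is an inference, not a published figure; the function names are hypothetical.

```python
import math


def correct_for_attenuation(r_observed, reliability_x, reliability_y):
    """Estimate the correlation between two measures free of unreliability."""
    return r_observed / math.sqrt(reliability_x * reliability_y)


def implied_observed_r(r_corrected, reliability_x, reliability_y):
    """Invert the correction to recover the uncorrected correlation."""
    return r_corrected * math.sqrt(reliability_x * reliability_y)


# Reliabilities from the text; the observed r is inferred, not reported.
r_obs = implied_observed_r(0.53, 0.52, 0.40)
print(round(r_obs, 2))
```

With reliabilities this low the correction more than doubles the coefficient, which is why the corrected value should be read with caution.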

Table 3 presents the Pearson product-moment
correlations between conditioning and Guilford’s
R Scale, Taylor’s Manifest Anxiety Scale, and the
MMQ for the number of eyeblink CR’s during acquisition
and extinction. For the PGR only the
acquisition CR’s were considered, since adaptation
proceeded too rapidly during extinction.


TABLE 3

CORRELATIONS AMONG CONDITIONING AND
GUILFORD’S R SCALE, TAYLOR’S ANXIETY SCALE
AND THE MMQ FOR ALL Ss

(N = 60)

             Eyeblink Conditioning          PGR Conditioning
Scale        Acquisition    Extinction      Acquisition
R              -.48           -.37            -.25
Anxiety        +.15           +.16            +.17
MMQ            +.08           +.04            +.20


The correlations between R and the conditioning measures
are, of course, negative because rhathymia
is a measure of extraversion. Using a one-tailed
test of significance, all three correlations between
R and the conditioning measures are highly significant,
whereas the correlations between conditioning
and the MMQ are insignificant in all cases.
Similar results were obtained when the three
groups were considered separately. Furthermore,
for each of the three conditioning measures
in Table 3, the correlation between conditioning
and R is significantly greater (p < .02) than the
correlation between conditioning and the Taylor
score and the correlation between conditioning and
the MMQ.
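
The significance of a single correlation in Table 3 can be examined with the usual t transformation of r, using N - 2 degrees of freedom. The sketch below applies it to the three R-scale correlations with N = 60. The source does not state which test was used, so this is one conventional choice shown under that assumption, and the function name is hypothetical.

```python
import math


def t_for_correlation(r, n):
    """t statistic for testing a Pearson r against zero, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# R-scale correlations from Table 3 (N = 60 in each case).
r_scale = {
    "eyeblink acquisition": -0.48,
    "eyeblink extinction": -0.37,
    "PGR acquisition": -0.25,
}
for measure, r in r_scale.items():
    print(f"{measure}: t = {t_for_correlation(r, 60):.2f} on 58 df")
```

The sign of t simply follows the sign of r; for a one-tailed test in the predicted (negative) direction only its magnitude is compared with the critical value.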

That the D and C scales are also measures of 
neuroticism is suggested by the high correlation 
(.88) between D+C combined and the MMQ. 
Taylor's Anxiety Scale correlated .86 with D + C 
combined and .92 with the MMQ (for all 60 Ss). 
As elaborated elsewhere (8), these intercorrela- 
tions strongly suggest that these three scales all 
essentially are measures of neuroticism; hence, 
the insignificant correlations of both the anxiety 
and the MMQ scales with conditioning. 

There were several other relevant findings. The 
hysterics tended to give fewer unconditioned PGR 
responses to the air puffs than the dysthymics. 
When the replies to the post-conditioning ques- 
tions were analyzed, it was found that the dys- 
thymics reported the air puffs as being disturbing 
significantly more often than did the hysterics. 
Although the difference was not significant, the hysterics
tended to report that they felt sleepy during the session
more often than did the dysthymics.

Discussion of Results.—These results would indicate
very strongly as far as the eyeblink reflex
is concerned, and very possibly for the PGR 
reflex, that conditionability is related to introver- 
sion-extraversion and not to neuroticism, the 
extraverted subjects tending to condition much 
less well than the introverted ones. The results 
also suggest that manifest anxiety is related to 
strong conditionability only to the extent that 
anxious people are introverted. Taylor's Anxiety 
Scale differentiates those subjects who condition 
well from those who condition poorly only to the 
extent that it fails to measure neuroticism. Since 
this scale has a very large projection on the 
neuroticism dimension and only a small projec- 
tion on the introversion dimension, it is probably 
an extremely poor diagnostic measure of condi- 
tionability. This explains the very low correla- 
tion between this scale and conditionability ob- 
tained in the present study and also the only 
slight successes obtained by Spence and his colleagues
and by Hilgard, Jones, and Kaplan (18)
in their attempts to obtain positive correlations 
between the Taylor scale and conditionability. 

The poor conditionability of the hysterics, their 
more rapid PGR adaptation to the air puffs, their 
subjective reports that the air puffs were not very 
disturbing, and perhaps their reports of feeling 
sleepy all support the hypothesis that hysterics 
are in a state of cortical inhibition in which dis- 
sociation phenomena predominate; in a similar 
manner, the results obtained for the dysthymic 
group support the hypothesis that dysthymics are 
in a state of cortical excitation. Since condition- 
ability in both neurotics and in normals is ap- 
parently related to introversion-extraversion and 
not to neuroticism, it would seem very probable 
that excitation-inhibition is closely related to 
introversion-extraversion.* This conclusion, if 
correct, is of considerable theoretical and prac- 
tical interest, and together with other experimental 
findings, should help to generate a dynamic and 
physiologically based theory of personality struc- 
ture (4) in which the behavior of both the dys- 
thymic and the introverted individual is described 
in terms of overconditioning and cortical excitation,
and the behavior of both the hysteric and
the extraverted individual is described in terms of
underconditioning and cortical inhibition.

* This is in agreement with the finding that sodium
amytal, an inhibitory drug, decreases conditionability
and increases extraversion (9).


SUMMARY 


Pavlov’s concepts of excitation and inhibition 
were related to the dimension of introversion- 
extraversion in normal and neurotic subjects. Nor- 
mal and neurotic subjects were conditioned, using 
the eyeblink and PGR reflexes. It was found that 
(a) anxiety states conditioned much better than 
hysterics, and (b) conditionability is related to 
introversion-extraversion and not to neuroticism. 


REFERENCES 


1. Eysenck, H. J. Dimensions of personality. 
London: Kegan Paul, 1947. 

2. Eysenck, H. J. The scientific study of per- 
sonality. London: Routledge & Kegan Paul, 
1952. 

3. Eysenck, H. J. The structure of human per- 
sonality. London: Methuen, 1953. 

4. Eysenck, H. J. A dynamic theory of anxiety 
and hysteria. J. ment. Sci., 1955, 101, 28-51. 

5. Finesinger, J. E., Sutherland, G. F., and McGuire,
Frances F. The positive conditioned
salivary reflex in psychoneurotic patients. 
Amer. J. Psychiat., 1942, 99, 61-74. 

6. Franks, C. M. An experimental study of conditioning
as related to mental abnormality.
Unpublished doctor’s dissertation, Univer. of 
London, 1954. 

7. Franks, C. M. The establishment of a con- 
ditioning laboratory for the investigation of 
personality and cortical functioning. Nature, 
1955, 175, 984-985. 

8. Franks, C. M. The Taylor scale and the 
dimensional analysis of anxiety. Rev. Psy- 
chol. appl., in press. 

9. Franks, C. M., and Laverty, D. G. Sodium 
amytal and eyelid conditioning. J. ment. Sci., 
1955, 101, 654-663. 

10. Franks, C. M., and Withers, W. C. R. Pho- 
toelectric recording of eyelid movements. 
Amer. J. Psychol., 1955, 68, 467-471. 

11. Grant, D. A. The pseudo-conditional eyelid 
response. J. exp. Psychol., 1943, 32, 139-149.

12. Grether, W. F. Pseudo-conditioning without 
paired stimulation encountered in attempted 
backward conditioning. J. comp. Psychol., 
1938, 25, 91-96. 

13. Guilford, J. P. Inventory of factors STDCR. 
Beverly Hills, Calif.: Sheridan Supply Co., 
1940. 


14. Guilford, J. P., and Guilford, Ruth B. Personality
factors S, E, and M, and their measurement.
J. Psychol., 1936, 2, 109-127.

15. Guilford, J. P., and Guilford, Ruth B. Personality
factors D, R, T, and A. J. abnorm.
soc. Psychol., 1939, 34, 21-36.

16. Guilford, J. P., and Martin, H. G. The Guilford-Martin
inventory of factors GAMIN,
manual of directions and norms. Beverly
Hills, Calif.: Sheridan Supply Co., 1945.

17. Hildebrand, H. P. A factorial study of introversion-extraversion
by means of objective
tests. Unpublished doctor's dissertation, Univer.
of London, 1953.

18. Hilgard, E. R., Jones, L. V., and Kaplan, S. J.
Conditioned discrimination as related to anxiety.
J. exp. Psychol., 1951, 42, 94-99.

19. Hilgard, E. R., and Marquis, D. G. Conditioning
and learning. New York: Appleton-Century-Crofts,
1940.

20. Humphreys, L. G. Measures of strength of
conditioned eyelid responses. J. gen. Psychol.,
1943, 29, 101-111.

21. Jung, C. G. Psychological types. London:
Kegan Paul, 1924.

22. McAllister, W. R. Eyelid conditioning as a
function of the CS-US interval. J. exp. Psychol.,
1953, 45, 417-422.

23. MacCorquodale, K., and Meehl, P. E. On a
distinction between hypothetical constructs
and intervening variables. Psychol. Rev.,
1948, 55, 95-107.

24. Pavlov, I. P. Conditioned reflexes. G. V.
Anrep (Trans.). London: Oxford Univer.
Press, 1927.

25. Pavlov, I. P. Lectures on conditioned reflexes.
Vol. 1. The higher nervous activity
(behavior) of animals. W. H. Gantt
(Trans.). London: Lawrence & Wishart,
1928.

26. Pavlov, I. P. Lectures on conditioned reflexes.
Conditioned reflexes and psychiatry.
W. H. Gantt (Trans.). New York: International
Publishers, 1941.

27. Raven, J. C. Progressive Matrices (1938).
London: Lewis, 1948.

28. Schiff, Ethel, Dougan, Catherine, and Welch,
L. The conditioned PGR and the EEG as
indicators of anxiety. J. abnorm. soc. Psychol.,
1949, 44, 549-552.

29. Spence, K. W., and Farber, I. E. Conditioning
and extinction as a function of anxiety.
J. exp. Psychol., 1953, 45, 116-119.

30. Spence, K. W., and Taylor, Janet A. Anxiety
and strength of the UCS as determiners of the
amount of eyelid conditioning. J. exp. Psychol.,
1951, 42, 183-188.

31. Spence, K. W., and Taylor, Janet A. The relation
of conditioned response strength to anxiety
in normal, neurotic and psychotic subjects.
J. exp. Psychol., 1952, 45, 265-272.

32. Taylor, Janet A. The relationship of anxiety
to the conditioned eyelid response. J. exp.
Psychol., 1951, 41, 81-92.

33. Taylor, Janet A. A personality scale of manifest
anxiety. J. abnorm. soc. Psychol., 1953,
48, 285-290.

34. Taylor, Janet A., and Spence, K. W. The
relationship of anxiety level to performance
in serial learning. J. exp. Psychol., 1952, 44,
61-64.

35. Taylor, Janet A., and Spence, K. W. Conditioning
level in the behavior disorders. J.
abnorm. soc. Psychol., 1954, 49, 497-503.

36. Welch, L., and Kubis, J. The effect of anxiety
on the conditioning rate and stability of
the PGR. J. Psychol., 1947, 23, 83-91.

37. Welch, L., and Kubis, J. Conditioned PGR
(psychogalvanic response) in states of pathological
anxiety. J. nerv. ment. Dis., 1947,
105, 372-381.


FRUSTRATION AND THE 
QUALITY OF PERFORMANCE: 
II. A THEORETICAL STATEMENT * 


Irvin L. CHILD AND IAN K. WATERHOUSE 


What is the effect of frustration on the quality 
of performance? There appears to be a dual tra- 
dition in the writings of psychologists and others 
who have given attention to this problem.

First, there is a tradition that frustration leads 
to improved quality of performance. Dewey's 
often cited account of why thinking occurs stresses 
the role of a problem or difficulty as the occasion 
for creative intellectual activity (8). Difficulty 
in such a situation often is an instance of frustration.
In more general accounts of the psychology
of adjustment, unreduced tension is shown
as giving rise to various forms of adjustment, of
which some may be of high intellectual quality
(25). On the level of society as a whole there
are notions such as Toynbee’s (26)—that the
protracted existence of a challenge, often in the
form of difficulty in meeting the needs of bare
subsistence, is the condition for the joint constructive
activity that produces a new civilization.

* Reprinted by permission from Psychological Review,
March, 1953, Vol. 60, No. 2, 127-139.

1 We are using frustration in a broad sense to refer
to prevention of a person’s direct progress toward a
goal, not wishing to prejudge by definition the importance
of various distinctions that can be made among
the variety of events that fit this definition. We
heartily agree with Brown and Farber’s emphasis on
the need to distinguish sharply between this definition
of frustration and its definition as referring to a state
of the organism (5, p. 480). But we feel it more
useful to apply the term to the event of prevention
of a person’s progress toward his goal than to a
state which may in some cases be inferred from the
event.

Second, there is also a tradition that frustration
leads to lowered quality of performance.
This is perhaps the more apparent part of the
thesis of psychoanalysis and psychology of adjustment,
since, on the whole, adjustments of poor
quality to frustration have received the greater
attention from therapists. This tradition is also
evident in much of the discussion about the disorganizing
effects of emotion (as reviewed, e.g.,
by Leeper (15)), inasmuch as emotion is often
produced by frustration. Barker, Dembo, and
Lewin’s study of frustration and regression (2)
is often cited in simple confirmation of this tradition,
to the neglect of the rest of its content.
Most recently this tradition is represented in
Maier’s systematization of the effects of frustration
(20), as most of the effects he deals with
would doubtless be considered to be of poor intellectual
quality.

There is, then, an apparent conflict of belief
in this matter. Indeed, the conflict appears strikingly
in some general textbooks in psychology.
In a chapter on thinking and reasoning frustration
is viewed as the condition for more organized
behavior, and in a chapter on emotion it is viewed
as the condition for less organized behavior. The
failure to use a common term such as frustration
in the two chapters apparently permits the contradiction
to go unnoticed.

Is this apparent contradiction due merely to
failure to appreciate the role of severity of frustration,
minor frustrations leading in fact to an
improvement in quality of performance and major
frustrations to the opposite, as might be inferred
from the settings in which these contrary effects
are often discussed? Presumably not in any very
uniform way, else why would anyone swear when
he stubbed his toe, and how could any prisoner
ever carry through successfully an ingenious plan
for escape?

The greatest advance toward resolving this contradiction
has been made by Barker (1) and by
Barker, Dembo, and Lewin (2). By drawing 
upon their contributions, upon other aspects of 
psychological theory, and upon evidence obtained 
in a variety of pertinent studies, we hope to ad- 
vance still further towards an understanding of 
the factors which influence the direction of change 
in quality of performance that results from frus- 
tration. 

We have found it convenient to deal with three 
problems which it is useful to separate for pur- 
poses of analysis: 

I. Effects of frustration in one activity upon
the quality of performance in a second activity.

II. Effects of frustration in one activity upon
the quality of performance in that activity.

III. Effects of frustration upon the quality of a
person’s behavior as a whole.

The three sections of this paper will be devoted 
to these three problems in turn. For the sake 
of brevity only one of these problems—the second 
one—has been selected for detailed treatment. 


I. EFFECTS OF FRUSTRATION IN ONE ACTIVITY UPON
THE QUALITY OF PERFORMANCE 
IN A SECOND ACTIVITY 


The well-known experiment of Barker, Dembo, 
and Lewin is presented by those authors as deal- 
ing with a generalized effect of frustration upon 
the constructiveness of a person’s behavior as a 
whole (2, p. 46). Actually, a critical analysis 
of the procedures and results indicates that it can 
only be said with certainty to deal with the effects
of frustration in one activity upon the quality
of performance in a second activity. The
activity frustrated was children’s play with a 
highly attractive set of toys; the second activity, 
in which quality of performance was measured, 
was play with a much less attractive set of toys. 
The theoretical discussion by Barker, Dembo, and 
Lewin, like their data, is most directly relevant 
to the problem of this section. 

In discussing this problem Barker (1) and, less
sharply, Barker, Dembo, and Lewin (2) make a
definite contribution to an understanding of the
factors which determine whether frustration will
lead to a better or poorer quality of performance. 
The suggestion they make about frustration in 
relation to poorer quality of performance we 
would rephrase as follows: frustration of one ac- 
tivity will produce lowered quality of performance 
in a second activity to the extent that it leads to 
the making of responses which interfere with the 
responses of the second activity. Barker, Dembo, 
and Lewin minimize the role of this sort of 
hypothesis in explaining their results. We have 
shown in a previous paper (6), however, that 
their results actually support this hypothesis very 
strongly; and we feel that this is the most impor- 
tant empirical contribution of their study. 

The opposite effect, improved quality of per- 
formance, is ascribed by Barker, Dembo, and 
Lewin to what we would call an increase, result- 
ing from frustration of one activity, in the 
strength of drives which support the second activ- 
ity. Barker (1) suggests three conditions under 
which such drives are likely to be strengthened in 
a way which results in increased quality of per- 
formance. We would rephrase them as follows: 

1. When the second activity can be and is mo- 
tivated in part by the original, unreduced drive 
which had been motivating the frustrated activ- 
ity, so that the second activity functions as a sub- 
stitute for the first. 

2. When frustration-produced drive leads to an 
attempt to escape from reminders of the frustrated 
activity, and preoccupation with the second activ- 
ity is the mode of escape hit upon. 

3. When the person was previously especially 
unmotivated with respect to the second activity, 
for it is then supposed that quality of performance 
may be favorably influenced by increased drive 
more than it is unfavorably influenced by inter- 
ference. 

These all seem to be significant suggestions, and 
in each case allied fields of research could pro- 
vide evidence that indirectly supports their plausi- 
bility. They have not, however, been tested sys- 
tematically in research on quality of performance, 
though they are drawn upon by Barker, Dembo, 
and Lewin in interpreting the behavior of indi- 
vidual subjects who in their experiment showed 
an increase instead of a decrease in constructive- 
ness after frustration (2, pp. 179-186). 

This contrast between the effects of interference
and of increase in relevant drive, resulting
from frustration, seems to us of fundamental importance,
though it leaves many questions unanswered.
This same contrast will be made in
connection with the second problem, to be con- 
sidered in the following section of this paper. 
Other points to be made in the following section 
can also be applied, with modification, to the pres- 
ent problem, but we shall discuss them explicitly 
only with reference to the second problem. 
There remains to be made here, however, a spe- 
cial point about the interference effect of frus- 
tration upon a second activity, a point which is 
distinctive for the problem of this section and 
essential for putting into proper perspective the 
role of frustration here. 

The point is this: frustration of the first activ- 
ity may, in comparison with active pursuit of the 
first activity, actually increase the quality of the 
second activity by reducing the amount of inter- 
ference with it. This is particularly likely to be 
true if the two activities are essentially alterna- 
tives of which the first activity is the preferred 
or dominant one. For if in this case the pre- 
ferred activity is being pursued without frustra- 
tion, all the overt responses which make it up 
are present to interfere with possible pursuit of 
the second activity. If, on the other hand, the 
preferred activity is thoroughly frustrated, there 
may remain, as possible sources of interference 
with the second activity, only implicit tendencies 
to return to the preferred activity. Interference 
arising solely from implicit tendencies, from 
thoughts, seems likely on the whole to be much 
less severe than interference arising from suc- 
cessful overt pursuit of a dominant activity. We 
suggest that one aspect of the Barker, Dembo, 
and Lewin experiment can probably be viewed 
in this light, though the design of their experi- 
ment does not permit our suggestion to be tested. 
We can only illustrate our meaning by suggesting 
a variation of conditions which was not actually 
used in their experiment. 

The constructiveness of children’s play with 
relatively unattractive toys was initially measured 
in a free-play period, with no other toys in sight. 
Later, the constructiveness of their play with 
these same toys was measured during a frustra- 
tion period, in which the children had just been 
interrupted in play with more attractive toys and 
these more attractive toys remained in sight behind
a wire barrier.² The constructiveness of
play with the unattractive toys was lower during 
the frustration period than it had been during the 
free-play period; but still, it was an activity of 
considerable constructiveness, or quality. Our 
contention is that the constructiveness of play 
with the unattractive toys would not have been as
high as it was, had it not been for the frustration 
arising from inability to play with the attractive 
toys. For, suppose that instead of being frus- 
trated, the children had been allowed to continue 
play with the more attractive toys, the unattrac- 
tive toys being put off by themselves in another 
part of the room. What, in this case, would 
have been the quality of performance in the sec- 
ond activity, i.e., interaction with the unattrac- 
tive toys? We would predict that it would fall 
into a very much lower level still—that it would 
be largely confined to glances and sporadic be- 
ginnings of play, rapidly interrupted by return 
to the more attractive toys. 

Frustration of a preferred activity, then, may 
produce for a second activity a degree of inter- 
ference which is intermediate—intermediate be- 
tween the greater interference which would have 
occurred in the absence of the frustration and the 
lesser interference which would have occurred in 
the total absence of the preferred activity. 

It is nonetheless true that frustration of one 
activity may constitute a genuine source of defi- 
nite interference with a second activity. This fact 
seems to be clearly demonstrated in two experi- 
ments which were intended to deal with another 
problem, that of repression. In both cases there 
was, as it happened, no question of a more pre- 
ferred and a less preferred activity; the instruc- 
tions of the experimenter simply required the 
subject to work at two activities successively, and 
it was possible to observe the effect of frustration 
in one activity upon the quality of performance
in the other activity. Sears (24) found that
frustrating subjects in their attempts to do well
at a card-sorting task led to a lowered quality of
performance in a learning task. Zeller (27)
found that frustrating subjects in their attempts
to do well at the Knox Cube Test also led to a
lowered quality of performance in a learning
task. In both instances it appears that frustration
in one activity led to internal responses (worry,
for example) which interfered with maximally
effective prosecution of a second activity.


2 For the purposes of the general point under discussion
it should be noted that the play with the
attractive toys is here regarded as the first activity,
and the play with the unattractive toys is regarded
as the second activity.

3 This prediction, as applied to the Barker, Dembo,
and Lewin experiment, is complicated by the fact that
children could integrate the two sets of toys in a
single play activity. For our point to be made, one
must suppose that the rules of the situation did not
permit this integration—a restriction which, for many
situations to which one would wish to generalize, is
imposed by the very nature of the activities.

It is probable that for Barker, Dembo, and 
Lewin’s subjects as well, there was interference 
genuinely arising from the frustration. This is 
suggested by the fact that the overt responses of 
their subjects included attempts to escape from 
the situation, a response which doubtless con- 
tributed to the total interference and appears to 
be a response specifically to the frustration. But 
the total interference was probably much less 
than it would have been without the frustration. 

In sum, then: Where quality of performance 
is lower than might be expected, and this lower- 
ing appears to be connected with the course of 
other activities, frustration of other activities is 
one possible source of interference; but successful 
pursuit of other activities may be a more impor- 
tant one. The college student who is frustrated 
in his attempts to arrange a date for the evening 
may not learn his German vocabulary that eve- 
ning as well as he could; but it’s a good bet that 
he’ll learn it better than if he had had a date. 


II. EFFECTS OF FRUSTRATION IN ONE ACTIVITY
UPON THE QUALITY OF PERFORMANCE 
IN THAT ACTIVITY 


In dealing with the effects of frustration in one 
activity upon the quality of ongoing performance 
in that activity itself, we shall organize our dis- 
cussion under five main headings. These repre- 
sent five kinds of process or event which may 
influence the effect that frustration has on the
quality of performance. This analysis has been 
difficult, because the several processes or events 
are closely interconnected and in many an in- 
stance would all be operating at once. We be- 
lieve, however, that this sort of analysis is useful 
for reaching an understanding of the effects we 
are dealing with. 

A. Extinction of the Initial Response to the 
Situation—When a person is frustrated in some
activity, the situation to which he is responding
is thereby somewhat changed. The extent to 
which it is changed, however, varies, and in some 
instances it may be useful, in predicting his re- 
sponse, to consider the situation to which he is 
responding as essentially the same as it was before 
the frustration. Where this is a useful approach 
to make, Hull’s concept of the habit-family hier- 
archy (12), expressed in a somewhat more gen- 
eral form, suggests the importance for our prob- 
lem of the extinction of the initial response to 
the situation. 

A person may be conceived of as having, in 
any specific situation, tendencies to make various 
response sequences which may all potentially lead 
to the goal towards which he is oriented. These 
various response sequences may be thought of 
as a hierarchy, the various members of which 
differ in habit strength (that is, in the strength of 
the tendency for them to be evoked). The se- 
quence for which the habit strength is initially 
strongest will be the one first evoked. If the re- 
sulting activity is frustrated, its habit strength is 
diminished by the process termed extinction. 
With persisting frustration, its habit strength may 
be reduced below that of the other members of 
the hierarchy. At this point the other response 
sequences in the hierarchy will begin to be evoked. 
The effect of frustration upon the quality of per- 
formance in this case, then, will depend upon the 
relative quality of the initial response sequence 
which is extinguished and of the other response 
sequences which are then evoked instead. 
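
The extinction account just given can be made concrete with a small simulation. The following Python sketch is offered only as an illustration under stated assumptions: the class ResponseSequence, the functions evoked and frustrate, the fixed decrement applied on each frustrated trial, and all numerical values are hypothetical and are not taken from Hull (12) or from any experiment cited here. It shows the dominant response sequence being weakened by repeated frustration until a sequence lower in the hierarchy is evoked, so that the change in quality depends on the relative quality of the two sequences.

```python
from dataclasses import dataclass


@dataclass
class ResponseSequence:
    """One member of a (hypothetical) habit-family hierarchy."""
    name: str
    habit_strength: float  # strength of the tendency to be evoked (arbitrary units)
    quality: float         # judged quality in the present situation (arbitrary units)


def evoked(hierarchy):
    """Return the sequence with the greatest habit strength (first one wins ties)."""
    return max(hierarchy, key=lambda seq: seq.habit_strength)


def frustrate(hierarchy, trials, decrement):
    """Run `trials` frustrated trials.

    On each trial the dominant sequence is evoked, goes unrewarded, and loses
    `decrement` units of habit strength (a linear rule assumed for simplicity).
    Returns the (name, quality) of the sequence evoked on each trial.
    """
    record = []
    for _ in range(trials):
        seq = evoked(hierarchy)
        record.append((seq.name, seq.quality))
        seq.habit_strength = max(0.0, seq.habit_strength - decrement)
    return record


if __name__ == "__main__":
    # Invented values: the initially dominant sequence is also the better one.
    hierarchy = [
        ResponseSequence("careful sorting", habit_strength=10.0, quality=0.9),
        ResponseSequence("hurried guessing", habit_strength=5.0, quality=0.4),
    ]
    record = frustrate(hierarchy, trials=4, decrement=2.0)
    for trial, (name, quality) in enumerate(record, start=1):
        print(trial, name, quality)
```

With these invented values the dominant sequence is evoked on the first three trials and the weaker sequence, of lower quality, on the fourth. If the two quality values were exchanged, the same process of extinction would raise rather than lower the quality of performance.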

On the whole, it seems likely that the initial 
response sequence will be the sequence of highest 
quality in the hierarchy. The reason for this 
expectation is that the response sequence of high- 
est quality is likely to have been the most strongly 
and consistently rewarded in similar situations 
in the past, and thus to have become the response 
sequence of greatest habit strength and the one 
first to be evoked.⁴

Several experiments on frustration appear to 
illustrate this effect of extinction of initial re- 
sponse upon quality of performance, though in 
each case it is quite likely that other processes 
to be considered later were also influential. 
Three experiments can serve as particularly apt 
examples. In each case the subject was required
to engage in some more or less intellectual
task. In an experiment by Sears (24), and in
one by McClelland and Apicella (16), the task
involved card sorting; in an experiment by Postman
and Bruner (22), the task consisted of attempting
to perceive and report words as their
exposure time was gradually increased from a
subliminal value. In a subject faced with any
of these tasks, there is evoked a general mode of
responding which is likely to be about the most
adaptive of which he is capable. But in each of
these experiments, as the subject responded and
continued to respond in this adaptive manner, the
experimenter withheld the normal reward of
knowledge of satisfactory performance, and instead
told the subject he was failing miserably,
doing worse than anyone else, etc. The lowered
quality of performance which then appeared in
each of these experiments may well have been
due primarily to the extinction, resulting from
nonreward, of the subject’s adaptive response
tendencies, together with the fact that for this
situation the subject did not have any alternative
response tendencies which would at all rival the
initially dominant one in the quality of performance
to which they would lead.


4 We are indebted to Dr. Gregory A. Kimble for
suggesting this point to us.

The effect of extinction of the initially domi- 
nant response tendency is not, of course, neces- 
sarily a lowering of quality of performance. The 
order of response sequences in the hierarchy for 
the given situation may be determined by gen- 
eralized effects of learning which took place in 
a previous situation (or situations) which was 
appreciably different from the present one. In 
particular, the previous and present situations 
may differ in the quality of performance which 
would be judged to characterize particular re- 
sponse sequences if evoked in those situations. 
Thus the present situation may first evoke a 
response sequence which was of high quality in 
former situations but of low quality in this sit- 
uation; extinction of the tendency to respond 
with this sequence may in that case lead to the 
evocation of a response sequence lower in the 
hierarchy but of higher quality in this situation. 

B. Situational Changes. —In the preceding sec- 
tion we considered certain implications that follow 
when the frustrated person may be considered 
to be still responding to essentially the same situation.
In this section we turn to certain implications
which follow when the character of the
frustrating circumstances is such that the person
must be considered to be now responding to a 
situation very different from that which preceded 
the frustration. We shall not deal here with the 
fact of frustration itself as a new element in the 
situation to which the person is distinctively re- 
sponding; this matter we shall discuss in section 
D. We shall deal here simply with specific 
changes in the situation which are inherent in 
the specific manner by which the frustration is 
brought about. 

The point we are concerned with here is this: 
One effect of frustration is to alter the person’s 
situation in such a way that behavioral possibili- 
ties are changed, and this alteration has implica- 
tions for the possible quality of the person’s 
performance. 

On the one hand, frustration may alter the 
situation in such a way as to render impossible 
any responses of high quality directed at the 
original goal. There is an approach to this con- 
dition in the Barker, Dembo, and Lewin experi- 
ment (2). The highly constructive behavior of 
complex play with the desirable toys was rendered 
impossible by making those toys completely un- 
available to the child. If the constructiveness of 
behavior in relation to the goal of playing with 
those inaccessible toys was characteristically re- 
duced by frustration (no systematic evidence was 
in fact collected on this point), was it not largely 
because this highly constructive behavior was 
made impossible and no other equally constructive 
behavior in relation to that goal was possible for 
most of the children? Parallel examples from 
nonlaboratory situations come readily to mind. 
For the man whose beloved marries someone 
else, the formerly most constructive behavior of 
striving by appropriate means to gain her affec- 
tion is now impossible if he accepts the morality 
of this society. Indeed, the situation is now so
changed that there may be no possibility at all 
of constructive behavior directed at his original 
goal; if he is still so strongly driven toward this 
goal as to make some kind of response in that 
direction, it must of necessity be of poor intel- 
lectual quality, as, for example, wish-fulfilling 
fantasies, or various nonconstructive social acts. 

On the other hand, frustration may alter the 
situation in such a way as to make possible the 
achievement of the goal by acts of higher intellectual
quality than were previously possible or
appropriate. The man who is digging a hole with
his spade, in order to plant a tree, has no more 
constructive behavior open to him than the sim- 
ple routine of digging; though this is the most 
efficient and adaptive behavior under the cir- 
cumstances, it is not of very high intellectual 
quality. But if the handle of the spade breaks, 
and he is thus frustrated, more complex and more 
constructive behavior, of higher intellectual qual- 
ity, now becomes possible and indeed essential 
as a means to the original goal. The skill of 
digging with a spade must now be integrated 
with the skill of shaping a new handle, or with 
the social skills involved in borrowing or buying 
another spade, in a much more complex sequence
of behavior leading to the original goal. This 
sort of effect may be seen in the Barker, Dembo, 
and Lewin study if one looks solely at the means 
by which the child achieved, or might have 
achieved, contact with the desirable toys. When 
they were freely available to him, he simply ap- 
proached and touched them. When a barrier 
was interposed, the only behavior that might pos- 
sibly have led him to these toys was a much more 
complex sequence of influencing the experimenter, 
though, as it happened, it had been predetermined 
that even this should not be successful. 

C. Quality of the Responses Available for Per- 
formance.—In sections A and B we have shown 
that the elimination of one response, as a result 
of frustration, may influence the quality of performance.
In section A, we considered the
elimination of one response through extinction. 
In section B, we considered the elimination of one 
response because of the removal of some kind 
of environmental facility or support which is 
essential for its performance. Just how this 
elimination of one response will affect the quality 
of performance depends, of course, both upon the 
quality of the eliminated response and upon the 
quality of the other responses which then come 
to be made. We must now consider explicitly, 
therefore, the question of what variables influence 
the quality of the responses available in the per- 
son’s repertoire and likely to be made if frustra- 
tion eliminates the initial response. 

In the case we have dealt with in section A, 
where the person may be considered as respond- 
ing to essentially the same situation before and 
after frustration, we have already suggested that 
Hull’s concept of the habit-family hierarchy provides
a useful theoretical schema for dealing with
this problem. The quality of the new behavior 
resulting from frustration would be predicted 
from the quality, as responses in this situation, 
of the responses next to the initially dominant one 
in the hierarchy. Actual application of this 
schema, of course, requires measurements both 
of quality and of habit strength of response se- 
quences. Such measurements are certainly pos- 
sible for complex human behavior, and have been 
made in connection with other problems. With 
reference to studies already done which are di- 
rectly relevant to this problem, however, the 
schema can only be applied by using gross judg- 
ments of great differences in quality and habit 
strength between the initial response sequence 
and other response sequences, as in our inter- 
pretation of the experiments we cited in section 
A. 

In the case we have dealt with in section B, 
where the situation to which the person is re- 
sponding must be considered as radically changed, 
the same theoretical schema of Hull’s may be 
considered as sometimes applicable. Here the 
quality of performance after frustration would be 
predicted from knowledge of the quality repre- 
sented by the response sequence highest in the 
habit-family hierarchy for this new situation. 
We know of no research studies, on just this 
point, to which this mode of analysis is readily 
applicable. It is obviously applicable, however, 
to incidents of everyday life. Imagine a person 
driving his car to work who is frustrated by a flat 
tire, which radically changes his immediate situation.
The quality of his response, e.g., swearing
and sulking vs. changing the tire or calling a 
repair man, seems likely to be influenced by what 
particular response tendencies to this changed 
situation have become dominant as a result of 
his previous experience in similar situations. This 
sort of analysis should also be useful in leading 
to systematic research. 

Regardless of how the elimination of the initial
response is brought about, however, the concept
of the habit-family hierarchy is not
adequate to deal with all cases. For in many
cases the initial response sequence is replaced, not
by some other response sequence which has a
predictable habit strength resulting from previous
reward or nonreward in similar situations, but
rather by some novel response sequence which
has never previously been performed by the person
in any situation.


5 The only close parallel in systematic research that
we know of is in a study by Davitz (7), which is
concerned with the problem of section I of this paper.
He frustrated children in an enjoyable activity of
watching movies and eating candy. Their subsequent
responses in a free-play situation were found to be
influenced, in a way relevant to quality of performance,
by previous training in a similar situation.

Now, under these circumstances, the problem 
of predicting the effect of frustration upon quality 
of performance becomes the specific problem of 
predicting whether a person, in the face of frus- 
tration, will produce novel responses and whether 
these novel responses will be of high or low 
intellectual quality. This specific problem is one 
to which a considerable body of scientific research 
is relevant. Relevant research is indeed so 
voluminous that we cannot hope to review it here. 
All we wish to do here is call attention to its 
relevance to the problem of frustration, for most 
of this research has been performed, and has 
been discussed, in contexts far afield from the 
dynamics of frustration reactions. 

First of all, there is research on intelligence as 
an organismic variable which influences the per- 
son’s reactions in a variety of situations. If in- 
telligence tests measure so broadly relevant a 
variable as is often hoped, that variable should 
be highly useful in predicting the quality of a 
person’s response to frustration—in predicting, 
in other words, the likelihood that a frustrated 
person will hit upon a novel response of high 
quality rather than persisting in an unsuccessful 
response or making novel responses of poor 
quality. 

Second, if this first point is correct, research 
on determinants of intelligence is also relevant 
to the present problem. If heredity, nursery- 
school training, institutionalization, intellectual 
character of the home environment, etc., influ- 
ence general intelligence, they should influence 
the likelihood that the frustrated person will make 
a novel response of high quality. 

Finally, research on factors in the immediate 
situation which influence the adequacy of reason- 
ing and problem solution is relevant to the present 
problem. Such research has not ordinarily been
formulated as dealing with frustration. When
we speak of frustration, we ordinarily think of
a person as at first anticipating steady progress
toward his goal, and at a later point encountering
a barrier. In experimental studies of reasoning 
and problem solution, on the other hand, the 
barrier is generally present at the outset; the sub- 
ject is asked to orient himself toward a goal 
which is obviously difficult to attain. But the 
determinants of quality of response under these 
special circumstances should certainly help to 
illuminate also response to problem situations as 
they arise in the form of frustration in normal 
life. 

As this research on situational variables in 
problem solution is less widely known than that 
on intelligence and its determinants, it may be 
pertinent here to cite examples. A series of 
experimental studies by Maier provides apt examples.
Maier demonstrates that the likelihood that
a subject will make the novel and correct response 
to a problem situation is influenced by demon- 
stration to the subject of part-responses which 
are required for it (18), by giving to the subject 
a particular kind of part-response which Maier 
terms “direction” (17), and by instructions designed
to establish a general set towards flexibility
of response (19). Such variables as these seem
clearly applicable to understanding the quality 
of response to frustration in everyday life. 

Much of the research on the availability of 
novel responses of high quality has been done in 
a strictly empirical context. This research is 
too diversified for us to attempt here any the- 
oretical integration. We would like to point out, 
however, that the same kind of behavioristic 
analysis which we apply elsewhere in this paper 
is applicable here too. Examples of its applica- 
tion may be found in a theoretical paper by Hull 
(13) which discusses an experiment by Maier on 
“reasoning” in rats, and in Dollard and Miller’s 
recent discussion of problem-solving behavior in 
human beings (9). That such an application of 
systematic theory may be fruitful for research in 
this area may be illustrated by recent studies by 
Birch (4) and by Gladstone (10), which demon- 
strate an influence of learning on the availability 
of responses for problem solution. 

D. Habits of Responding to Frustration—We 
have so far considered the person as responding 
to the situation in which frustration is occurring, 
but not as responding to the fact of frustration 
per se. But the occurrence of frustration is, of 
course, itself a distinguishable aspect of the situation
to which the person may respond distinctively.
A person might conceivably have general
habits of responding to all frustrations, or he 
might have more specific habits of responding 
to particular classes of frustration which for him 
were distinctive. The possible responses which 
might, in various individuals, have come to be 
elicited by the cue of frustration, are of course 
innumerable. We propose to call attention here 
to several classes of response which appear to 
have a special relevance for the influence of frus- 
tration upon quality of performance. 

1. Persistence vs. withdrawal. Persistence in 
striving for the goal, in the face of frustration, 
is a response which keeps the individual in the 
situation and makes possible the emergence of 
novel responses of high intellectual quality, though 
whether such responses do in fact emerge will 
then depend upon such variables as those con- 
sidered in section C. The degree of persistence 
appears to be in part determined by habits of 
response to frustration. Grosslight and Child 
(11) showed, for one experimental situation, 
that subjects who had been subjected to frustra- 
tion in the experiment and rewarded for per- 
sistence, subsequently persisted much longer in 
the face of continuous frustration than did sub- 
jects who had experienced only success until the 
time of continuous frustration.⁶ In the same
study tentative evidence was found that the first 
group of subjects, as a result of their persistence, 
were more likely to make novel or creative re- 
sponses of a sort which under many circum- 
stances would lead to a removal of the frustra- 
tion. The second group of subjects, on the other 
hand, were more likely simply to withdraw from 
the situation or confine their responses to mere 
staring. 

2. Interfering responses. Another difference 
among persons in their habits of responding to 
frustration has to do with their tendency to make 
responses which interfere with effective pursuit of 
the original activity and thus lower its quality. 
Thus Waterhouse and Child⁷ used a questionnaire
to measure the extent to which individuals
habitually respond to frustration with potentially
disruptive reactions, such as aggression, self-blame,
and self-justification. They found that
people scoring high on this personality measure,
when subjected to experimental frustration,
showed a lowered quality of performance; people
scoring low on this personality measure, on the
other hand, when subjected to the same experimental
frustration, actually showed an improved
quality of performance. A closely parallel finding
was obtained by Mandler and Sarason (21;
cf. their Predictions 4 and 5) in a study using a
different kind of intellectual performance and a
different measure of tendency to make interfering
responses, a questionnaire designed to measure
degree of anxiety as a response to examination
situations. This finding is still further confirmed
in another experiment by Sarason, Mandler, and
Craighill (23). We may tentatively conclude
that strong habits of responding to frustration,
or to the anticipation of frustration, with responses
which interfere with a complex intellectual
activity tend to lower the quality of such
activity when it must be performed in a frustrating
situation.


6 A finding, of course, parallel to the typical outcome
of experiments on partial reinforcement using
traditional conditioning techniques (14).

7 Frustration and the quality of performance. II.
An experimental study. J. Pers., 1953, 22, in press.

3. Drive-producing responses. Among the re- 
sponses a person may make to frustration are 
internal responses which create or strengthen 
drive states. Indeed, some of these drive-pro- 
ducing responses are among the interfering re- 
sponses we have mentioned in the preceding para- 
graph. Drive-producing responses, however, 
have two special properties in relation to the 
present problem. 

(a) Certain drive states produced in response 
to frustration may operate to increase the motiva- 
tion supporting the goal-oriented activity and 
thereby to improve the quality of performance. 
Individuals who have habitual tendencies to react
to frustration with responses which create or
strengthen these particular drive states would
then improve their performance in the face of
frustration. There appears to be no published
research which provides clear evidence directly
relevant to this notion, or which would indicate
what particular drive states operate in this way,
but this general notion appears to us to be a
useful guide to future research.


8 Brown and Farber (5) have recently published an
article which, while not focused on the problem of
quality of performance, is highly relevant at this
point to our treatment of this problem. Their “‘emotional’
interpretation of frustration behavior” might
be regarded in large part as a much more thorough
attack on the problem we deal with under the label
of “drive-producing responses.” We differ from them
in viewing an emotional interpretation, and what they
call nonemotional interpretations, which would include
most of the rest of our treatment, not as alternative
approaches (5, p. 480) but as two aspects
of theory which need to be put together for the
prediction of behavior.

(b) Quality of performance is likely to be 
greatly influenced not only by the drive states 
created by frustration, but also indirectly by other 
responses which are evoked by those drive states. 
The individual’s habits of responding to drive 
states—in particular to the drive states likely to 
be evoked by frustration—thus are crucial in de- 
termining the effect of frustration upon the qual- 
ity of his performance in the original activity. 
Among the drive states likely to be evoked by 
frustration are states of intense general emotion. 
These emotional states provide an apt example 
to illustrate the point we wish to make here. 

Psychology textbooks often refer to the dis- 
organizing effects of severe emotion (15). Un- 
doubtedly severe emotion does often have a dis- 
organizing effect and thus reduces the quality of 
performance in the face of frustration. In part 
this may be because the emotional responses 
themselves are to some extent incompatible with 
the ongoing instrumental activity. But in even 
greater part the disorganizing effect may have 
to do with responses to the emotional state. A 
typical person in our society is likely to have 
well-established tendencies to react to strong emo- 
tion with various responses—such as withdrawal 
from the emotion-arousing situation, close atten- 
tion to the emotional experience, worry, expres- 
sive behavior such as swearing and gesturing— 
which all tend to interfere with efficient pursuit 
of the original goal-oriented activity. We would 
suspect that persons with a different habit struc- 
ture might react to the same emotional states in 
themselves with a higher, rather than a lower, 
quality of performance. This appears to be the 
assumption underlying certain aspects of military 
training and implicit in the belief that seasoned 
troops are more dependable than inexperienced 
ones—the assumption that training can modify 
the way a person responds to an intense emo- 
tional state, indeed can modify it so radically that 
intense emotion may come to have an organiz- 
ing rather than a disorganizing effect on behavior. 

E. Situational and Task Variables in Relation 
to the Fact of Frustration—In section D we have 
shown that the person’s habits help determine 
whether his response to the fact of frustration 
will be such as to improve, or such as to detract 
from, the quality of his performance. In this 
section we wish only to point out briefly that 
the person’s response to the fact of frustration 
will also be influenced by a variety of situational 
and task variables. The same kinds of response 
to frustration remain pertinent here. 

Differences in instructions or in initial set 
given by the situation, for example, may in- 
fluence the likelihood that frustration will evoke 
persistent striving or, on the other hand, with- 
drawal. Various specific circumstances in the 
situation may help determine whether frustration 
evokes responses which interfere with the original 
activity, and what effect it has on drive states and 
on responses to these. The extent to which 
heightened drive can lead to improved perform- 
ance, and the extent to which other responses are 
incompatible and produce interference, may vary 
with the exact nature of the task or activity in 
which quality of performance is being judged. 
In all these ways, then, the kinds of responses 
we have considered in section D are relevant to 
the effect of frustration on quality of perform- 
ance, but relevant not only as a function of the 
person’s habits but also as a function of situa- 
tional and task variables. 


II. EFFECTS OF FRUSTRATION UPON THE QUALITY 
OF A PERSON’S BEHAVIOR AS A WHOLE 


In the analysis we have presented thus far, 
frustration and its consequences may play a very 
unimportant part in the total life of the person. 
Yet there is nothing about the analysis we have 
presented that restricts it to such cases. We wish 
to illustrate here the applicability of the analysis 
to cases of more pervasive effects on quality of 
behavior, though detailed application to any one 
problem is beyond our present intent. 

One sort of case in which the question of a 
pervasive effect of frustration on quality of per- 
formance arises is the situation where the frus- 
trating circumstances are such as to interfere with 
not one, but a great many of the person’s striv- 
ings. Several such situations have been studied 
by psychologists and other social scientists, e.g., 
internment in a concentration camp, unemployment, 
and subjection to severe acculturation pressures. 
As a single example, to illustrate the applicability 
of our analysis, we refer to Bettelheim’s 
study of concentration camp inmates (3). 

Where very widespread frustration is imposed 
by the environment, as in concentration camps, 
it seems clear that the elimination of old responses, 
partly through impossibility of performance and 
partly through extinction, is a first determining 
factor. But with old responses eliminated there 
is a wide range in the quality of the new responses 
that appear. The availability of specific responses 
to this situation will be one potent influence; an 
example is the role that Bettelheim’s scientific 
background must have played in enabling him as 
an individual to make a highly constructive ad- 
justment to a concentration camp, through react- 
ing to the situation in part as an intellectual chal- 
lenge (3, pp. 422-424). Another potent influ- 
ence is likely to come from the individual's habits 
of responding to frustration; an example of this 
is provided in Bettelheim’s account of the per- 
sistent submission to authority of the German 
middle class as a factor interfering with a con- 
structive adjustment to a concentration camp (3, 
pp. 425-426). Finally, the role of situational 
factors may be illustrated indirectly by Bettel- 
heim’s point that the typical adjustment of various 
categories of prisoners differed because of the dif- 
ferent significance of imprisonment for them in 
view of their backgrounds (3, pp. 424-429). 

A somewhat different kind of case in which the 
question of pervasive effects on quality of per- 
formance arises, is the case in which a person’s 
behavior is simply noticed to be conspicuously 
high or low in general quality, and there is the 
possibility that this condition is in part a product 
of psychodynamics. Psychotics, some neurotics, 
and some apparently feebleminded persons may 
be seen as persons with a generally low quality 
of response which seems to result from dynamic 
processes of adjustment to life situations. “Frus- 
tration” is certainly no adequate label—if there be 
one—for the great variety of situations to which 
is attributed an important role in these processes. 
Yet frustration certainly is one of the variables 
relevant to understanding these processes. The 
opposite extreme, of the genius with a generally 
very high quality of performance, has been little 
studied from a psychodynamic point of view, so 
that we can hardly venture a judgment about the 
probable role of frustration here. 

In cases of neurosis or psychosis the applicability 
of our general approach is very aptly illustrated 
by certain aspects of the recent book by 
Dollard and Miller (9). In their interpretation 
of such pervasively maladjustive behavior, they 
stress the role of a person’s previously established 
habits of response to strong emotion or to specific 
drives. In particular, habits which prevent the 
correct labeling of an emotion or drive tend to 
reduce the quality of behavior by eliminating the 
potentialities for fine discrimination and appro- 
priate generalization which may be brought into 
a behavior sequence by the use of correct labels. 
An account of neurotic or psychotic behavior in 
such terms tries to explain much more than the 
patient’s reaction to frustration; it tries to explain 
also, for example, his reactions to goal attainment. 
But the point relevant here is that, if this is a use- 
ful explanation of neurotic behavior in general, 
it is also specifically an explanation of why neu- 
rotics may tend to make responses of poor quality 
in frustrating situations. 


THE EFFECTS OF PSYCHOLOGICAL 
STRESS UPON PERFORMANCE * 

AND SONIA F. OSLER* 


An understanding of the effect of psychological 
stress upon skilled performance is of great theo- 
retical and practical importance. People are often 
faced with the necessity of performing skilled 
work under conditions which are highly stressful. 
Such is obviously the case in military combat. 
The effectiveness of a pilot, gunner, or radar ob- 
server must be maintained even when he is threat- 
ened by physical injury or harassed by the need 
to hurry the performance of a complicated task. 
The obvious fact that human beings are often 
required to work under stress does not call for 
further elaboration. 

The problems of stress involve questions of 
emotions, motivation, and learning. Many theoretical 
issues in these fields are of basic importance 
in an analysis of the effects of stress upon 
performance. This fact has been recognized; most 
of the experimental work upon stress has been 
undertaken for theoretical rather than practical 
reasons. The problem of the effects of stress cuts 
across many fields. We are able to draw many 
hypotheses concerning stress from the theoretical 
constructs of motivation, emotion, and learning. 

We shall begin the discussion of the work in 
this field with an analysis of the concept of stress. 
Following this we shall describe the various ex- 
perimental techniques used to induce stress. This 
aspect of the problem is important because there 
are many methodological difficulties inherent in 
the production of a genuinely stressful situation. 
Then we shall review, briefly, the kinds of per- 
formance which have been studied under stress. 
Finally, we shall discuss some of the theoretical 
implications of the work on stress and skilled 
performance. 


THE CONCEPT OF STRESS 


It is not possible to discuss intelligently the 
work on psychological stress without dealing with 
the problem of the concept of stress. The defi- 
nitions of stress that have been given from time 
to time are inadequate for several reasons. It is 
possible to think of stress in terms of situations. 
For example, we say that a crucial examination 
is stressful to the participants, or that combat is 
stressful to soldiers. One difficulty with this ap- 
proach is that these situations are not reacted to 
uniformly by all people. We cannot predict the 
behavior of individuals by simply describing the 
situation. One person may tremble, sweat, expe- 
rience discomfort, and show signs of behavioral 
disorganization. Another may show an impair- 
ment in performance with no other subjective 
concomitants. Still others may show no measur- 
able effects from the situation. 

In most of the research on stress, the experi- 
menter selects a situation which, from past expe- 
rience, seems to be threatening to most people. 
Implicit in this selection is the necessity of identi- 
fying stress with the motivations of the people 
who are being tested. However, because people 
differ in motivations and in the ways they deal 
with them, it is never really possible to define a 
general stress situation. The situation will be 
more or less stressful for the individual members 
of the group, and it is likely that these differences 
in the meaning of the situation will appear in 
terms of performance. 

It is also possible to define stress by emphasiz- 
ing the reactions or responses of an individual 
rather than the situation. The trouble with this 
approach is similar to that encountered when we 
emphasize the situation. What kinds of reactions 
should we measure? Are we to consider changes 
in skilled performance as the “measure” of stress, 
or are we to consider changes in subjective report? 
It is apparent that these things may change inde- 
pendently of one another. Moreover, these 
changes are a function of many unrelated vari- 
ables. For example, skilled performance may be 
affected by a change in motivation or a change 
in approach to the task. It would be meaningless 
to identify these changes as the effects of stress. 

Since stress cannot be defined in terms of stim- 
ulus or response operations alone, it is necessary 
to think of it in terms of an intervening variable.² 


2 Since the submission of this manuscript an inter- 
esting article by Brown and Farber, entitled “Emo- 


The additional concept that is necessary is that 
of motivation. Stress, therefore, is really a sec- 
ondary concept, built upon the relationship be- 
tween a primary concept, motivation, and the sit- 
uation in which motivated behavior appears. We 
would then think that stress occurs when a par- 
ticular situation threatens the attainment of some 
goal. The actual responses that the individual 
may show will depend partly upon the kinds of 
mechanisms that have been previously established. 

This viewpoint demands that the concept of 
motivation itself be explored. The psychologist 
who is interested in problems of human behavior 
finds it very difficult to estimate from measures 
of behavior the kind and degree of motivation 
involved in a particular situation. There is gen- 
eral agreement among psychologists that it is ulti- 
mately essential to do this in order to account for 
the enormous individual differences that are found 
in behavior. In studies of psychological stress 
individual differences tend to be one of the main 
findings. 

Pointing out the parallel between the problems 
of psychological stress and those of physiological 
stress will illustrate the difficulty in dealing with 
motivation. Physiological stress does not seem 
to involve the same definitional problems that 
psychological stress does, because the “motiva- 
tional component” in physiological stress is stated 
in terms of the well-worked-out mechanisms of 
homeostasis. Selye (34) has defined physiologi- 
cal stress as any condition that produces the 
“adaptation-syndrome,” which is the reaction of 
the organism in returning to the homeostatic state. 

The psychologist has no adequate way of defin- 
ing the psychological condition that corresponds 
to the homeostatic steady state. Consequently, 
the use of the term stress must necessarily be a 
little looser than we would like it to be. When 
we speak of tension-systems, what we are really 
doing is postulating a psychological steady-state 
as a lack of tension. What needs to be investi- 
gated are the properties of such a state and devia- 
tions from it. 

The solution of most experimenters who have 
studied the responses of groups under stress has 
been to produce situations which are thought to 


tions conceptualized as intervening variables with suggestions 
toward a theory of frustration,” has appeared 
in this Journal (November 1951, 48, 465-495). In 
it Brown and Farber offer some opinions which ap- 
pear to be parallel with, and certainly related to, our 
theoretical discussion of psychological stress. 


thwart the motives of most people. This is an 
adequate solution as long as one is not attempting 
to account for the reactions of any individual. If 
the experimenter tries to account for a particular 
individual’s response, this assumption is not satis- 
factory. 


EXPERIMENTAL PROCEDURES 
FOR PRODUCING STRESS 


The principal problem in the study of behavior 
under stress has been the production of realistic 
stress situations. A variety of techniques have 
been tried. Indeed, it might be said that no two 
experimental studies in the literature exactly du- 
plicate the same technique. This variety of 
method has led to considerable confusion, since 
it is likely that each of these techniques has a 
somewhat different effect upon performance. It 
is important to review the main techniques that 
have been used. These techniques fall into two 
main classes: (a) stress induced through failure, 
and (b) stress induced by the task itself. 

Stress Induced by Failure.—Failure or threat 
of failure at a task has been the method most fre- 
quently used in experiments on stress. This has 
been specifically done in the following ways: 

1. By presenting the subject with an unsolvable 
task. In this procedure the subject must work 
at a task that, unknown to him, cannot be solved. 
There are a number of ways in which this condi- 
tion may be established. For example, anagrams 
for which there are no solutions may be mixed 
with a group of solvable ones. Or, in another 
case, a subject may be required to retain a list 
of digits which is beyond any individual's memory 
span. 

2. In another type of failure-stress, the subject 
may be interrupted at the task before he could 
possibly have finished. The task may consist of 
a group of arithmetic problems. Before he has 
finished all of the problems, he is interrupted, and 
told that his time is up. 

3. A common technique for the production of 
failure-stress is the introduction of false norms 
which indicate failure even if performance has 
been adequate. The use of this technique may 
incorporate aspects of the first two procedures. 
For example, an individual may be interrupted 
before he has completed all of the items on an 
arithmetic test, and then told by the experimenter 
that anyone who has not finished has done poorly. 
The subject may be told at the outset of the 
experiment that anyone with normal intelligence 
(or anyone who expects to be successful in col- 
lege, or have a decent career in the Air Force, 
etc.) ought to be able to complete this task within 
the time limit. In the latter case, failure to com- 
plete the problems automatically tells the subject 
that he is doing poorly. 

These techniques of failure-stress present some 
difficulties. One of the most important experi- 
mental limitations is the problem of the control 
of the subject’s motivation. In order for a failure 
situation to be stressful it is necessary for the 
individual to be motivated to succeed, or at least 
to avoid failure. While most subjects are anxious 
to succeed, the real problem is that all subjects 
are certainly not equally anxious. Some may not 
be motivated at all. The usual assumption is that 
enough of the subjects will be sufficiently involved 
to become stressed. Some subjects will be seri- 
ously disturbed by the threat of failure, while 
others may be scarcely threatened at all. Of 
course, the effect of stress will depend upon what 
the individual expects or demands of himself. 

Unfortunately, there is no way of assessing with 
any confidence the degree of motivation of sub- 
jects. We cannot take the effect of stress on 
performance itself as an indicator of the intensity 
of motivation because the amount of motivation 
is not the only determiner of the extent and direc- 
tion of the reaction to stress. It is certainly to 
be expected that in some instances strong motiva- 
tion may produce better performance. On the 
other hand, it is also capable of producing im- 
paired performance. For the moment, the ques- 
tion of motivation in failure-stress experiments is 
an important, unsolved problem. 

A second difficulty in the failure-stress proce- 
dures is presented by the degree of realism that 
can be produced. In order to be genuinely 
stressed, subjects must be convinced that the in- 
structions and information given by the experi- 
menter are bona fide. Usually the experimenter 
trusts to his ingenuity and assumes that his in- 
structions will be accepted at face value. There 
are very few experiments which attempt to assess, 
directly or indirectly, the degree to which the sub- 
jects accept the situation as genuine. There can 
be no doubt that in most of the experimental situ- 
ations, the subjects’ reactions to the experiment 
vary from skepticism to severe ego-involvement. 

Finally, a seldom considered difficulty in the 
use of failure-scores to produce the stress is the 
extent to which a subject will use this false information 
to alter his mode of attack on the task 
at hand. Changes in behavior toward the task 
as a result of failure-information may sometimes 
have little or nothing to do with stress itself. A 
subject who has been doing an excellent job may 
be encouraged (particularly if he is adaptable) 
to give up a fruitful mode of attack on the 
grounds that it has been unsuccessful. Thorndike 
pointed out long ago that an experience of failure 
may make a subject more variable in performance 
on the basis of the information it gives him. 

Stress Induced by Working Conditions and the 
Task Itself.—In addition to the experiments in 
which stress is produced through failure, pressure 
on the subject may be induced by manipulating 
the situation in various ways so as to produce 
excessive demands upon him. Various forms of 
distractions may be included in this category. 
Almost any strong sensory input which is extrane- 
ous to the task at hand may serve as a source of 
distraction. In an experimental situation it may 
be an electric shock, noises, or flashing lights. 
Distraction can also be produced by verbal dis- 
paragement of the subject's performance by the 
experimenter. Such disparagement may serve as 
a failure-stress as well as a distraction. In some 
experiments it is impossible to assess the role of 
verbal disparagement because it is introduced 
while the subject is performing, and is therefore 
both a threat of failure and a distraction. 

We have suggested that the manipulation of the 
experimental situation along the lines of distrac- 
tions may make difficult demands on the subject, 
and that these demands then serve as sources of 
stress. Moreover, some tasks themselves appear 
to be inherently stressful for much the same rea- 
sons. The task may require the subject to attend 
to too many things at a time, or to perform too 
many operations at once. These demands may 
be increased by the experimenter as in the case 
of rapidly pacing the subject. In some ways this 
type of stress is like the situation a person faces 
when he is learning to drive a car. The stress 
here is probably less a matter of failure—although 
in some instances it could be—and more of mak- 
ing too many demands at once upon the learner. 
In other cases there is the added possibility of the 
threat of personal injury or damage to the car. 
We would class this stress situation as one induced 
by the working conditions and the task itself. 

We have contrasted these two basic types of 
stress situations, failure-stress and task-induced 
stress, because the problem of motivation appears 
to be a little different for each of them. In the 
task-induced stress, the motivation depends pri- 
marily upon how the subject interprets the experi- 
ment. If he sees the distraction or pacing as 
“something to beat,” his effort level is apt to be 
raised. Motivation can be roughly equated for 
all subjects by carefully wording the initial in- 
structions, or can be left to vary from subject to 
subject by ambiguous information. Consider, for 
example, the effect of instructions which indicate 
that the purpose of the experiment is to see how 
well the subject can hold up under distraction as 
compared with the introduction of distraction as 
a natural part of the experiment or the task so 
that the subjects will not react to it as something 
to be overcome. Despite the methodological 
problems raised here, the problem for the experi- 
menter is simpler in the task-induced stress than 
it is in the failure-stress experiment. The failure- 
stress situations depend primarily on the produc- 
tion of a realistic threat to the subjects’ self esteem 
or to some goal-oriented behavior. The variabil- 
ity of reactions to this kind of condition must be 
considerably greater than to the situation where 
stress is induced by manipulation of the working 
conditions or by the excessive demands of the 
task. 

Evaluating the Differences in Technique.—Un- 
fortunately, we can only guess, at present, what 
the effects of the differences in the techniques of 
production of stress might be. Most of the experi- 
mental situations which have been used in the 
study of stress are not directly comparable with 
one another. It is almost impossible to generalize 
with confidence from the study of the effects of 
any one experimental stress-producing technique. 
One of the important problems for future research 
is the study of the differences in the effects of the 
various procedures for inducing stress. 


THE MEASUREMENT OF THE EFFECTS OF STRESS 


One of the chief difficulties in the design of 
stress experiments is concerned with the measure- 
ment of the effects of stress upon performance. 
It is important to measure the effects of stress 
unconfounded with any other variable. The problem 
arises because it is often very difficult to get 
a measure of reaction to stress which is inde- 
pendent of the subject's ability to learn or per- 
form certain kinds of tasks. 

For example, if the experimental problem is to 
study the effects of some kind of stress upon the 
learning of a psychomotor skill, the question 
arises, “How can we differentiate the subject’s 
‘normal’ rate of learning from his rate of learning 
as it is affected by the stress condition?” For 
data which deal with groups of subjects there is a 
simple solution to the difficulty. It is possible, 
through control subjects, to estimate the typical 
rate of learning and compare it with the rate of 
learning of a comparable group of subjects under 
some stress condition. If the experiment is prop- 
erly designed, and the control group appropriately 
selected so that it is adequately matched with the 
experimental group, then we can observe the gen- 
eral effects of stress as a difference in the mean 
performance or a difference in the variability of 
performance between the two groups. 
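
The group comparison just described can be sketched in a few lines. This is only an illustration: the function name `compare_groups` and the scores are hypothetical, invented for the example, and no significance test is attempted. It reports the two quantities the text names, the difference in mean performance and the relative variability of the stressed and control groups.

```python
from statistics import mean, variance


def compare_groups(control, stress):
    """Summarize a stress effect at the group level.

    Returns the difference in mean performance (stress minus control)
    and the ratio of sample variances (stress over control).
    """
    return {
        "mean_difference": mean(stress) - mean(control),
        "variance_ratio": variance(stress) / variance(control),
    }


# Hypothetical scores, invented purely for illustration.
control_scores = [52, 55, 49, 58, 51, 54]
stress_scores = [47, 60, 41, 56, 39, 50]

summary = compare_groups(control_scores, stress_scores)
print(summary)
```

A negative mean difference would correspond to impaired performance under stress, and a variance ratio above one to greater variability under stress. Both readings presuppose that the two groups were adequately matched.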

The tricky problem actually arises when we are 
concerned with the performance of a particular 
individual under stress. Whenever we are en- 
gaged in predicting performance under stress from 
personality measures, it is necessary to obtain a 
measure for that particular subject. This “stress 
score” must be measured independently of the 
subject’s initial ability and his change in perform- 
ance due to fatigue or learning. Because indi- 
viduals differ so much with respect to their basic 
abilities and rates of change, some estimate of 
each subject’s performance without stress is nec- 
essary before correlational techniques may be 
used. 

It has been possible to solve this dilemma in 
three ways. The most obvious technique has 
been to test each subject twice—a pretest under 
standard conditions followed by a stress test. In 
that way, a difference score may be obtained be- 
tween performance under stress and performance 
without stress. If this difference score correlates 
with the score on the first test (as it usually does) 
then it is necessary to correct this difference meas- 
ure so that it is free from the subject’s level of 
performance on the first test. It is a statistical 
fact that subjects who do well on the first test 
tend to go down on retest, while subjects who do 
badly on the first test improve on the second test. 
This means that difference scores cannot be used 
to represent stress scores without a correction for 
regression on the first test. When this correction 
has been made, the resulting score represents the 
effect of stress upon performance for any one 
subject. 
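
One way such a correction might be carried out is sketched below. It is an assumption for illustration, not necessarily the procedure used in the studies reviewed: the difference scores are regressed linearly on the first-test scores within the same group of subjects, and each subject's residual is taken as his stress score. The function name and the data are hypothetical.

```python
from statistics import mean


def regression_corrected_scores(pretest, stress_test):
    """Residual of (stress - pretest) after linear regression on pretest.

    Assumes a linear relation fitted on the same subjects; the residuals
    are uncorrelated with the pretest scores by construction.
    """
    diffs = [s - p for p, s in zip(pretest, stress_test)]
    p_bar, d_bar = mean(pretest), mean(diffs)
    # Least-squares slope of the difference score on the pretest score.
    sxx = sum((p - p_bar) ** 2 for p in pretest)
    sxy = sum((p - p_bar) * (d - d_bar) for p, d in zip(pretest, diffs))
    slope = sxy / sxx
    # Subtract the part of each difference predictable from the pretest.
    return [(d - d_bar) - slope * (p - p_bar) for p, d in zip(pretest, diffs)]


# Hypothetical scores, invented purely for illustration.
pretest_scores = [40, 45, 50, 55, 60, 65]
stress_scores = [44, 43, 51, 50, 52, 58]

corrected = regression_corrected_scores(pretest_scores, stress_scores)
print(corrected)
```

Because the residuals have zero covariance with the first-test scores, a subject who began high is not credited with a stress decrement merely for regressing toward the mean. Fitting the line on an unstressed control group instead would be a different, and arguably cleaner, design choice.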

Another technique for obtaining individual 
stress scores is to make use of plateaus in the 
learning curves, or to select tasks for study that 
show little or no change in performance with 
training. In the case of plateaus, if a subject has 
attained a point in the learning of a skill where 
little or no further change in performance is ex- 
pected for some time, this level might be used 
as a reference point against which to compare his 
performance under stressful conditions. Or the 
plateau may be induced by giving a subject so 
much practice that he attains his maximum 
achievement on the task, and no further changes 
are expected. 

The trouble with plateaus is that they are sel- 
dom very dependable. Most plateaus are illusory, 
or cannot be depended upon to remain steady 
long enough to introduce the stress condition sat- 
isfactorily. In the case of the final plateau, where 
subjects have been practiced to a maximum of 
performance, this level is often really not a true 
maximum, and further practice produces a spurt 
in performance. Furthermore, this latter tech- 
nique, used by Williams (43), has the disadvan- 
tage of producing a stress situation which can 
only result in a decrement in performance if the 
subjects have reached their physiological limit. 

In summary, it may be said that it is necessary 
to provide some kind of base line or standard of 
comparison against which to evaluate the uncon- 
taminated effects of stress upon performance. 
This reference point must be made free of the 
influence of the subjects’ abilities in the skill in 
question. To do this, the technique which ap- 
pears to present least difficulty makes use of two 
test periods with each subject, one under stress 
and the other under control or standard condi- 
tions. We should point out that even this tech- 
nique is by no means ideal. Difference scores 
suffer from the difficulty of requiring two equiva- 
lent tests for the same subject. Moreover, it may 
be important which is given first, the control or 
the stress condition. 

A third approach is possible, but so impractical 
that we shall dismiss it quickly. If it were pos- 
sible to match individuals beforehand on ability 
to learn or perform the skill, then one of the pair 
could be given to the control condition and the 
other to the experimental condition. Pretests 
actually could be used to match subjects. How- 
ever, this technique has not been used because it 
would be necessary to match not only initial score, 
but also rate of learning, approach to the task, 
motivation, etc. And besides, a good matching 
is practically unattainable, because there are a 
great number of uncontrolled variables. 


EXPERIMENTAL STUDIES OF PERFORMANCE 
UNDER STRESS 


For convenience, the experimental studies have 
been classified into those which deal with verbal 
tasks and those which deal with perceptual-motor 
tasks. This division is somewhat arbitrary, but 
follows along more or less traditional lines. It is 
possible that a more rational classification may 
arise out of further experimental work. 

The Effects of Stress upon Verbal Performance. 
—Several experimenters have studied the effects 
of stress upon intelligence, primarily verbal, in 
children. Using nine-year-old boys, Lantz (15) 
obtained a statistically significant impairment of 
Stanford-Binet scores following a failure experi- 
ence, but no such effect after a successful experi- 
ence. An examination by Lantz of the differ- 
ential effects of this failure experience upon the 
various subtests indicated that tasks requiring 
visual or rote memory were not affected, while 
those involving reasoning or thinking suffered a 
decrement. 

Along similar lines, an interesting experiment 
by Hutt (13) differentiated between the effects of 
failure-stress upon a group of maladjusted chil- 
dren and a group of well-adjusted children. Two 
methods of administration of the Stanford-Binet 
were used: (a) an adaptive procedure, in which 
a failed item was followed by one upon which 
success was likely, and (b) the standard method, 
which begins with easy items and ends with items 
which will be failed. The adaptive method, which 
produced greater psychological support, resulted 
in higher average IQ’s for the maladjusted group. 
No difference between the two testing methods 
was found for the well-adjusted group. This study 
has considerable implication for the problem of 
stress in ordinary psychometric procedures. 

The great majority of the studies of stress and 
verbal performance show deterioration or impair- 
ment as the result of the experimental conditions. 
Alper (2) finds a decrement in production in a 
sentence-formation task as the result of failure- 
stress. In another experiment, stress induced by 
pacing and the inherent difficulty of the arithmetic 
problems produced a reduced rate of learning 
(40); pacing, in this case, produced failure in 19 
out of every 20 trials. Impairment of digit span 
was obtained with fifth- and sixth-grade children 
when the number of digits presented was beyond 
the normal span for these children (44). 

Zeller (45), working with nonsense syllables, 
found a decrement in recall and relearning follow- 
ing an experience of failure; he accounted for this 
decrement on the basis of repression of the items 
which subjects associated with failure. His evi- 
dence for this hypothesis was that relearning and 
recall improved when the knowledge of failure 
was removed. Zeller’s study may be interpreted 
as basically a stress experiment. Zeller comments 
that the subjects’ performance progressively de- 
teriorated during the accumulation of failure. 

In another failure-stress experiment in which 
nonsense syllables were learned and recalled, Sul- 
livan (37) found that (a) success produced the 
most rapid learning and the most complete recall, 
(b) slowest learning and poorest recall followed 
failure, and (c) failure was more harmful to a 
superior group in producing impairment than suc- 
cess was facilitating. The reverse was true for an 
intellectually inferior group. 

Impairment of performance on the digit-symbol 
test was found by Williams (43) with the use of 
false failure scores. Williams was primarily inter- 
ested in validating the Rorschach test. He con- 
centrated upon the prediction of performance 
under stress by selected Rorschach indices. 

In another experiment, stress, induced by fail- 
ure to complete a task within a specified time 
limit, produced an increase in errors and variabil- 
ity in time scores of multiplication problems and 
in the learning of nonsense syllables (21). 

Postman and Bruner (30) studied recognition 
thresholds under stress using three-word sentences. 
Psychological stress produced by failure and ridi- 
cule resulted in poorer performance. Since the 
stress was administered during actual perform- 
ance, the results may have been due, in large part, 
to distraction. The authors observed that during 
failure, the prerecognition guesses of the subjects 
became extremely reckless. 

Despite this array of studies showing impair- 
ment, poorer performance has not always been 
found to be the principal effect of stress. Initial 
decrement but later improvement in code-learning 
was obtained as the result of stress due to pacing.³ 


³ Deese, J., and Bowen, H. M. The effect of task- 
induced stress on code learning. Unpublished re- 
search, The Johns Hopkins University, 1950. 


Hurlock (12), on the other hand, found that 
failure results in an initial improvement and then 
a later decrement. Lazarus and Eriksen (16) 
found that the major effect of failure-stress upon 
the digit symbol test was to produce greater vari- 
ability between subjects; some subjects improved 
and others showed impairment. 

Finally we might note that there have been a 
large number of studies aimed at the analysis of 
selective recall or repression that have some bear- 
ing on the problem of the effects of stress (2, 6, 
10, 14, 24, 31, 45). Eriksen’s data (6) suggest 
that poor or good recall of incompleted tasks may 
be a function of personality variables related to 
defense mechanisms. 

The Effects of Stress upon Perceptual-Motor 
Performance.—The general picture of the effects 
of stress upon perceptual-motor performance is 
similar to the pattern found with verbal tasks. A 
large number of studies have shown impairment 
of perceptual-motor performance under stress. 
Marquart (23), with the use of visual discrimina- 
tion, concluded that frustration resulted in slow 
learning, increased rigidity, and non-adaptive be- 
havior. McClelland and Apicella (20) used card 
sorting as the experimental task and found that 
stress induced by false failure-scores resulted in 
more trials before the criterion was reached. 


Longer reaction times for completing pictures 
that were flashed on a screen were obtained by 
Verville (41) when subjects were given unsolvable 
tasks prior to the measurement of reaction time. 
Moreover, slower reaction times were also found 
by Verville for subjects who were required to 
solve a number of difficult problems simultane- 
ously before the test of reaction time. 

Bayton and Whyte (4), who used a rate of 
manipulation test, found that a success-failure 
sequence produced poorer performance than a 
sequence of failure followed by success. Because 
they failed to include a control group, it is impos- 
sible to decide whether the differences in perform- 
ance between the conditions are due to success, 
failure, an interaction between these, or sampling 
error. 

Seashore and Bavelas (33) found a decrement 
in mental age scores on the Draw-a-Man test 
under conditions of frustration. The subjects 
were children of varying ages. In a famous ex- 
periment designed to get at the relation between 
frustration and regression, Barker, Dembo, and 
Lewin (3) demonstrated a regression in mental 
age of young children as a result of stress. 

One of the most intensive attempts to evaluate 
the use of stress as a psychometric device for per- 
sonnel selection was undertaken by the Aviation 
Psychology Program of the Army Air Forces 
(27). With the use of various psychomotor tasks 
such as steadiness, aiming, etc., the effects of 
verbal threat and distraction upon performance 
were measured. Since this program was primarily 
concerned with prediction of success in training 
schools, very little systematic information con- 
cerning the direct effects of stress upon the meas- 
ured performance is available. Many of the 
studies were conducted without adequate control 
groups. However, in our analysis of these data 
(27), the following facts seem to be revealed: 
(a) Where it can be established that stress pro- 
duced an effect, the effect seemed to be a small 
decrement in performance; (b) an adaptation- 
effect to the stress condition is suggested by the 
fact that the stress tests were the only psycho- 
motor tests which produced different results ac- 
cording to their position within the batteries of 
tests. The scores for the stress tests were poorest 
when they occurred first in the battery. 

In psychomotor performance, as well as in ver- 
bal performance, impairment of ability has not 
always been found to be the only result of stress. 
Lindsley (18) found that stress induced by rapid 
pacing caused the subjects to attempt more prob- 
lems, which resulted in a greater number of errors 
and more variability between subjects. Total 
scores remained approximately the same because 
the increased speed was offset by an increase in 
the number of errors. Further data in the type 
of experiment described by Lindsley were ob- 
tained by McKinney et al. (22). Two kinds of 
reactions to stress were found: most subjects 
speeded up performance at the expense of accu- 
racy and were more variable, while a few sub- 
jects showed a stable performance without any 
increase of speed or errors. As a whole, the 
stressed subjects showed a reduction of efficiency 
in performance in terms of the percentage of cor- 
rect items to the number attempted. Clark and 
McClelland ⁴ reported that subjects with mild fear 
of failure performed better, particularly towards 
the end of a test. Finally, improved performance 
under failure-stress was found by Gates and Riss- 
land (9) with the use of motor coordination and 
color naming tasks. 


⁴ Clark, R. A., and McClelland, D. C. A factor 
analytic integration of imaginative, performance and 
case study measures of the need for achievement. 
Unpublished research, cited in McClelland (19), 

The Effects of Stress upon the Components of 
Performance.—Most of the studies of stress re- 
port findings concerning the total scores on the 
performance of some task. There is usually no 
attempt made to analyze these total scores into 
their various components and to observe the ef- 
fects of stress upon them. Two subjects could 
obtain identical scores on a task as the result of 
very different patterns of performance. This is 
especially true where scores may be analyzed in 
terms of speed and accuracy and where an in- 
crease in speed may go along with a decrease in 
accuracy. A report in terms of total scores may 
obscure what is really happening to the subject’s 
performance. 

Only two studies to date have presented data 
relevant to this problem. In both of these studies 
an increase in errors was accompanied by an 
increase in speed as a result of stress (16, 18). 
In one of these studies, however, speed accounted 
for so much of the variance of the total scores 
that any increase in errors was ineffective in 
changing the total score (16). It is probable 
that the effects of stress upon such components 
as speed and accuracy may be very different from 
task to task. 
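
The point about components can be made concrete with a small sketch. It assumes, purely for illustration, that the total score is the number of correct items and that efficiency is the percentage of attempted items answered correctly; the `Performance` class and all figures are hypothetical and are not taken from any of the studies cited.

```python
from dataclasses import dataclass


@dataclass
class Performance:
    attempted: int  # items attempted within the time limit (speed)
    errors: int     # items answered incorrectly

    @property
    def correct(self) -> int:
        """Total score, assumed here to be the number of correct items."""
        return self.attempted - self.errors

    @property
    def efficiency(self) -> float:
        """Percentage of attempted items that were correct."""
        if self.attempted == 0:
            return 0.0
        return 100.0 * self.correct / self.attempted


# Invented figures: the stressed record is faster but less accurate.
control = Performance(attempted=40, errors=4)
stressed = Performance(attempted=50, errors=14)

for label, p in (("control", control), ("stressed", stressed)):
    print(label, p.correct, p.attempted, p.errors, round(p.efficiency, 1))
```

With these invented figures the two records have the same total score, yet the stressed record shows more items attempted, more errors, and lower efficiency, which is the pattern a report in total scores alone would conceal.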

Adaptability to the experimental conditions and 
various other less easily measured aspects of per- 
formance may also be differentially affected by 
stress. One problem in this area is the difficulty 
of determining which measure represents the sub- 
ject’s efficiency in any particular task. In some 
situations accuracy may be of greater importance 
than speed, while in others rapidity of perform- 
ance may be the important feature. 

Qualitative Observations of Performance under 
Stress.—In connection with experimental studies 
of the effects of stress upon performance, many 
qualitative changes in behavior have been noted. 
Stereotyped responses (23), inattention (44), dis- 
organized activity (36), and increased overt ac- 
tivity (8) have been some of the aspects of be- 
havior under stress that have been observed. 
Many experimenters have reported signs of emo- 
tional upset such as sweating, tremor, subjective 
anxiety, pulse changes, etc. A whole new theory 
of the physiology of the organism (34) has grown 
up around the organic changes observed as the 
result of physiological and psychological stress. 

Investigators have attempted to make field ob- 
servations on the effects of stress upon perform- 
ance by means of questionnaires, interviews, and 
rating scales. These studies have the merit that 
they attempt to discover the effects of non-labora- 
tory stresses upon behavior. However, they have 
the great disadvantage of depending entirely upon 
self-report techniques. The most serious difficulty 
with these studies is that they do not include 
actual, objective measurements of performance. 
The questionnaires are always administered after 
the stressful situation has passed away, sometimes 
years later (5). 

Dollard (5) administered a series of question- 
naires on fear in battle to veterans of the Abra- 
ham Lincoln Brigade of the Spanish Civil War. 
These questionnaires suffered from too many limi- 
tations to be of much use. The techniques of 
sampling, question-wording, the nature of the at- 
titudes measured, etc., make this study difficult 
to evaluate. The data do tell us something about 
the conditions which produce stress in actual com- 
bat and some of the personal reactions to these 
stresses. However, there is no way of knowing 
how the intense fears reported by Dollard’s sub- 
jects affected their performance. 

A somewhat more satisfactory questionnaire 
approach was attempted by Shaffer (35), with 
the use of veteran AAF flying personnel. Follow- 
ing an analysis of the results of his study, Shaffer 
(35) concluded that the adequate stimulus for 
fear is a highly motivated situation towards which 
the individual has no adequate means of adjust- 
ment. This definition seems to fit most of the 
experimental studies of behavior under stress. 

Shaffer reported many different signs of emo- 
tional upset in combat. Some of his data sug- 
gested that the severe stress of battle may, indeed, 
have a serious detrimental effect upon the various 
skills required in combat. Subjective feelings of 
fear increased with the number of missions that 
were flown, though the pilots questioned believed 
that this was probably not accompanied by an 
increase in the deterioration of behavior. Some- 
what greater faith can be placed in the informa- 
tion from this report because of the cross-checks, 
the closer proximity in time of the interviews to 
the experience in question, and the somewhat 
more adequate control of sampling. Neverthe- 
less, Shaffer’s study still suffers from a complete 
dependence upon self-report. 

The studies of Shaffer are the only field ob- 
servations which actually attempted to assess 
the effects of combat stress on performance 
through self-report. The statement of a majority 
of the men questioned was that fear increased 
their efficiency or at worst made no difference 
in it. Only from 17 to 31 per cent of the men 
in the various groups interviewed reported that 
fear decreased the efficiency of their performance. 
Those men who reported an increased efficiency 
as the result of stress said that mild fear in- 
creased their efficiency more than intense fear. 

The findings of this study are extremely inter- 
esting and provocative, but there is a distinct 
probability that the reports of the men inter- 
viewed do not reflect the actual state of their per- 
formance under stress. It is likely that the inter- 
viewees confused effort with performance. Also, 
under the circumstances of testing, men might 
be extremely unwilling to admit behavioral dis- 
organization under combat. The individuals who 
were interviewed had no good criteria for an 
evaluation of their own performance under stress. 
The stressful conditions themselves probably re- 
duced the accuracy of self-estimates of perform- 
ance. As a result, we suspect that the percentage 
of men experiencing no detrimental effects of 
stress was greatly overestimated. In light of the 
experimental studies on performance under stress, 
it does seem likely that some individuals will show 
a facilitation of performance under such condi- 
tions, but it also seems probable that these indi- 
viduals will be in the minority in any randomly 
chosen sample of people. This interesting prob- 
lem remains to be studied by more reliable 
methods. 

With the use of narco-analytic techniques, 
Grinker and Spiegel (11) studied battle-fatigued 
soldiers of the last war. They suggested that a 
primary mechanism for psychological breakdown 
under combat involves strong guilt feelings and 
conflict over supposed inadequacy and cowardly 
behavior during combat. While it is provocative, 
this study is less relevant here because all of the 
subjects involved were psychiatric patients. 

Personality Correlates of Behavior under Stress. 
—Very little information has been obtained about 
the relationship between various measures of per- 
sonality and reaction to stress. The problem has 
theoretical as well as practical importance. On 
the one hand, while great individual differences 
in response to stress have been recognized, few 
fruitful attempts have been made to discover their 
nature. On the other hand, it would be most 
useful to be able to predict which people will be 
adversely affected by a stressful situation. 

Most of the studies which have obtained rela- 
tionships between performance under stress and 
measures of various aspects of personality have 
presented correlations which are statistically sig- 
nificant but are too small to be of practical value. 
For example, Taylor and Farber (39) found that 
submissive children showed a decrement in per- 
formance and an increase in variability under 
failure-stress, while ascendant children showed 
an improvement. In Hutt’s (13) study, previ- 
ously cited, maladjusted children showed im- 
paired mental age scores on a stressful administra- 
tion of the Stanford-Binet. Lazarus and Eriksen 
(16) have demonstrated that students with high 
grade-point averages in college tend to improve 
under stress while poor students show a decre- 
ment and greater variability in performance when 
ACE scores are held constant. The authors ac- 
count for this relationship by the suggestion that 
some poor students may obtain poor grades be- 
cause of the stressful nature of college examina- 
tions. 

In a study that is difficult to interpret, Meadow 
(25) found that women with low dominance 
feelings were more emotionally disturbed by fail- 
ure than women with high dominance. In this 
experiment emotional disturbance was measured 
by subjective report; the author found that women 
emotionally disturbed by the failure showed 
poorer performance in arithmetic and memory 
tests for emotional words than did those women 
who reported no emotional disturbance. It is 
impossible to tell to what extent the subjects’ 
reports of disturbance may have been influenced 
by their actual level of performance. Individuals 
who showed poor performance may have at- 
tempted to rationalize this by claiming to be 
emotionally upset during the test. The study, 
however, is in accord with the results found by 
Taylor and Farber (39). 

The only study that has reported relationships 
between performance under stress and personality 
variables of sufficient magnitude to be of practical 
use is that of Williams (43). In a validation 
study of certain variables on the Rorschach test, 
Williams reported a multiple correlation coeffi- 
cient of .824 between average decrement under 
stress and two scoring categories on the Ror- 
schach. There are several features about this 
study, however, which suggest caution in accept- 
ing the results. In the first place, the number 
of cases upon which the correlations were based 
was only 25. There are strong possibilities of 
sampling errors with so few cases. In the second 
place, the material used was over-learned, so that 
the scores for the subjects under stress could 
only decrease. Thus, nothing but “decrement” 
scores were obtained. In addition, it is to be 
noted that the stress was produced partly by the 
distracting influence of the threat of electric shock. 
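
The concern about so few cases can be illustrated with a standard shrinkage adjustment to the squared multiple correlation. This is a sketch only: the adjustment is not something the original study reports, the function name is hypothetical, and the formula corrects only for sample size and number of predictors. It does not allow for any selection of the two scoring categories from a larger set of Rorschach variables, which would call for a further downward correction.

```python
def adjusted_r_squared(r: float, n: int, k: int) -> float:
    """Adjusted R^2 for a multiple correlation r, with n cases and k predictors.

    Uses the usual formula: 1 - (1 - r^2) * (n - 1) / (n - k - 1).
    """
    if n - k - 1 <= 0:
        raise ValueError("need more cases than predictors plus one")
    return 1.0 - (1.0 - r * r) * (n - 1) / (n - k - 1)


# Figures from the text: R = .824, 25 cases, two scoring categories.
adj_small = adjusted_r_squared(0.824, n=25, k=2)

# Same R with a hypothetical sample ten times as large, for comparison.
adj_large = adjusted_r_squared(0.824, n=250, k=2)

print(round(0.824 ** 2, 3), round(adj_small, 3), round(adj_large, 3))
```

The adjusted value for 25 cases falls further below the unadjusted squared correlation than the value for the larger hypothetical sample, which is the sense in which a coefficient based on few cases is less trustworthy.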

The doubts concerning the generality of the 
results of this study are supported by negative 
findings in a study correlating a large series of 
Rorschach variables with the effects of stress upon 
performance (7). 

Also on the negative side is a study by Adams 
(1), who found no relationship between neurotic 
tendency scores on the Bernreuter Personality 
Inventory and performance under failure-stress. 
In a similar experiment using the Bernreuter, 
Marquart (23) also obtained negative results. 

McKinney et al. (22) found no relationship 
between attitudes and feelings expressed on a 
questionnaire and efficiency of performance under 
stress. Some slight positive relationships are 
suggested, however, with variability in a few 
Rorschach responses. However, the measure of 
variability of these Rorschach responses is not 
described in the experimental report. 

Relevant to the problem of the personality 
correlates of performance under stress is an ex- 
periment by Taylor (38). She found that indi- 
viduals who show high anxiety as measured by a 
questionnaire are people who learn a conditioned 
eyelid response most rapidly. Since the condi- 
tioned eyelid response is of the avoidance type, 
the author concludes that this superiority is the 
result of a high anxiety drive necessary for rapid 
avoidance conditioning. 

Performance under Stress as a Predictor.— 
There have been two systematic attempts to use 
performance under stress as a predictor of suc- 
cess in other dangerous and stressful activity. 
One was the well-known OSS study, and the 
other was an attempt by the Aviation Psychology 
Program to design stress tests. 

The more elaborate of these experiments was 
the program conducted by the Office of Strategic 
Services to select men who would be most effec- 
tive in intelligence operations in enemy territory 
during World War II (46). The basic procedure 
was to subject a group of applicants to a series 
of stress tests designed by the psychological staff 
of the OSS. The test situations were highly 
realistic in the sense that they did a good job of 
simulating stressful situations to which the can- 
didate was liable to be subjected while on duty. 
The men were aware that they were being evalu- 
ated. The stress that was created was basically 
a failure-stress. All measurements of perform- 
ance were based upon ratings by continually ob- 
serving psychologists. While large individual 
differences were found by the raters, these were 
not successfully related to later performance in 
the field. A possible reason for this was that 
the various subjects in this experimental program 
served in different theaters of war and performed 
different functions, so that the criterion of per- 
formance was not comparable. The OSS study 
cannot be considered to have achieved successful 
prediction of performance under stress, despite 
its elaborate design. 

The Aviation Psychology Program (27) made 
an attempt to relate performance on a series of 
psychomotor tests given under stressful condi- 
tions to success in pilot training, bombardier 
training, and navigator training. No significant 
relationships were found between any one of the 
five tests used and any available measure of suc- 
cess in training school. Several major criticisms 
of this study may be given. First, the criteria 
used were poor ones inasmuch as they were very 
heavily saturated with semi-academic abilities. 
Stress may actually lower the validity of a test 
when the criterion is success in school. More- 
over, performance in training school may be 
poorly related to performance under operational 
conditions that involve stress. In support of this 
last statement is a study of the validity of selec- 
tion and classification tests in the theaters of war. 
The conclusion was drawn that many of the most 
important features of combat performance were 
not measured by the traditional abilities tests 
(17). Second, the most important factors meas- 
ured by these AAF stress tests may not be ability 
to perform well under stress but the specific 
motor skills called for by the task itself. No at- 
tempt was made to measure deterioration under 
stress, or any other variable that would yield a 
picture of the effects of stress independent of the 
individual’s ability. This is a problem of experi- 
mental design that we have discussed earlier. 

It should be mentioned also that attempts to 
use the selection and classification battery of the 
AAF to predict the syndrome known as “anxiety- 
reaction” or “operational fatigue” also resulted 
in failure (42). 


AN EVALUATION OF THE PROBLEM OF THE 
EFFECTS OF STRESS 


The stress experiments mentioned earlier have 
yielded varying results: some show only a decre- 
ment in performance, others show improvement, 
and still others produce both of these effects for 
different individuals (an increase in variability). 
It is important to understand why the results 
have been seemingly inconsistent. 

The distressing fact is that few of the experi- 
ments are comparable because they employed 
different kinds of conditions to produce stress 
and different kinds of tasks. The question of the 
dissimilarity of the two large classes of stressful 
conditions, failure-stress and task-induced stress, 
has already been raised. No data are available 
at present that allow us to make any generaliza- 
tion about this question. 

Another important problem concerning the 
comparability of the various experiments cited 
is that of individual differences and sampling. 
Various investigators have used children, adults, 
women, men, college students, and military per- 
sonnel. Inasmuch as large individual differences 
have been found in all of the experiments, it is 
reasonable to suppose that some of the disagree- 
ments between these experiments are due to dif- 
ferences in the samples employed. All in all, 
the body of experimental literature on the topic 
of the effects of stress on performance is com- 
pletely unsystematic. 

An integrated theoretical picture about the 
effects of stress upon performance must take 
account of individual differences, the finding of 
impairment as well as improvement of perform- 
ance, the influence of different situations, and the 
effects of different kinds and amounts of stress. 
All of the psychological concepts about stress 
have been motivational in frame of reference. 
For convenience, we may categorize the explana- 
tory concepts into those which emphasize primar- 
ily the energizing and directive aspects of motiva- 
tion and those which emphasize the emotional 
aspects of motivation. 

The Energizing Aspect of Motivation.—Miller 
(28) and Wickert (42) have emphasized fear as 
a motivation or drive. Since an increase in 
motivation is usually accompanied by an in- 
creased output in performance, Miller and 
Wickert both suggest that fear, produced in a 
stressful situation, may actually be beneficial to 
performance. They cite as evidence the fact 
that AAF personnel reported that under the stress 
of combat they performed more efficiently (42). 

This simple notion of fear as a drive may well 
account for the increased efficiency that one finds 
occasionally for some individuals under stressful 
conditions. Few people would argue with the 
proposition that motivation does increase the ade- 
quacy of performance. It is apparent that fear 
produced by a stressful situation has a consider- 
able motivational component. 

The difficulty arises because of the fact that, 
in some instances, high degrees of motivation or 
fear seem to produce a decrement or impairment 
of performance. Neither Miller nor Wickert 
takes this into account. It is therefore necessary 
to look for mechanisms that may supplement the 
notion of the simple, energizing function of moti- 
vation. It is possible to think of a critical point 
in the amount of fear, beyond which disruption 
occurs. 

The Directive Aspects of Motivation.—In addi- 
tion to producing the impetus to action and the 
persistence of action, motivation has a directive 
aspect and a terminating effect (26). This means 
that an individual will direct his efforts towards 
whatever operations tend to be satisfying. When 
the motives have been satisfied, the activity in 
this direction will be terminated. In extremely 
powerful motivational states, it is very possible 
that a subject’s activity will be unadaptively di- 
rected. For example, Lindsley (18) and Lazarus 
and Eriksen (16) showed that one of the effects 
of a threat of failure was an increase in the speed 
of performance as well as an increase in the 
number of errors. It is quite possible that a con- 
siderable portion of the decrements in performance 
reported in other experiments may be due to an 
inappropriate increase in speed of performance. 

Another illustration of unadaptive behavior as 
the result of stress comes from an experiment by 
Patrick (29), who found irrational and repetitive 
behavior in human subjects who were placed in 
a problem box and subjected to severe stress. 

In many of the experiments in which failure- 
stress was used, it is possible that telling an in- 
telligent subject that he has done poorly will 
force him to alter his mode of attack, so that it 
may be less effective. This kind of an effect could 
account, at least in part, for impairment found 
in some of the experimental studies. 

An additional way in which the directive as- 
pects of motivation could produce a decrement in 
performance is the arousal, by threat, of motives 
that are antithetical to the activities required by 
the task itself. For example, Rosenzweig (32) 
suggests that reactions to frustration may be need- 
persistent or ego-defensive. Need-persistence 
means that the individual will center his attention 
and efforts upon the frustrated need (goal- 
oriented behavior), and ego-defense means that 
the individual will be primarily concerned with 
the maintenance of self-esteem. 

There are many instances in which activity 
directed toward the maintenance of self-esteem 
may make the effective continuance of perform- 
ance difficult or impossible. Some writers have 
referred to this kind of situation as stimulating the 
individual to leave the field psychologically. The 
ego-defensive aspects of the situation may be- 
come so important to the individual that he ceases 
or reduces his attack on the task at hand. For 
example, a subject in an experiment may stop 
work so that, even if he fails in the eyes of the 
experimenter, he can justify his failure to him- 
self on the grounds that he really didn’t try. The 
result of this defensive mechanism is most cer- 
tainly to reduce the person’s effort level toward 
the experimental task. 

Thus there are other motives that may inter- 
fere with the individual’s need to work at and 
do his best on the task at hand. In situations 
that threaten physical danger, the motivation to 
escape may directly interfere with the individual’s 
performance. His continued performance may 
be, or seem to be, incompatible with self-preserva- 
tion. A conflict may be produced between the 
desire for self-preservation and the desire to per- 
form well. This conflict could result in impair- 
ment of the individual’s level of performance. 

The Emotional Aspects of Motivation.—Emo- 
tional reactions usually accompany very power- 
ful motivation. There is no need to elaborate 
upon the autonomic components of emotion. The 
disruptive aspect of emotion, of which the auto- 
nomic components are perhaps the most char- 
acteristic, may produce mental blocks, tremors, or 
severe anxiety-reactions that make the satisfac- 
tory completion of a given task difficult or im- 
possible. Anxiety-reactions may include such 
effects as nausea, fainting, spasms, weakness, 
headaches, dissociation, etc. Clinicians have long 
been aware that the ability of people to retain 
lists of digits may be severely impaired by the 
presence of strong anxiety. This recognition finds 
expression in the diagnostic use of the digit-span 
test of the Wechsler-Bellevue Intelligence Scale. 

One of the effects of anxiety might be to pro- 
duce a powerful distraction. Threatened sub- 
jects frequently report that their productive think- 
ing is disrupted by the compelling preoccupation 
with the thought of the consequences of failure 
or danger. We might consider that in some tasks, 
e.g., those that require fairly automatic responses, 
this preoccupation would have little effect, where- 
as in others, e.g., those that require concentra- 
tion, this preoccupation would be very disrupting. 

In the simplest sense, the autonomic overflow 
in strong emotion would make physical perform- 
ance difficult or impossible. The individual who 
collapses before going into combat is protected 
against the dangers of combat. The person who 
develops writer’s cramp is temporarily removed 
from the threatening examination. It is difficult 
to get adequate subjective reports on the milder 
effects of autonomic disturbance that may inter- 
fere with an individual’s performance, because 
subjects may magnify these effects in order to 
make use of them as ego-defensive mechanisms. 
It is possible, for example, that mild autonomic 
reactions may be accompanied by an improve- 
ment in performance, even though the individual 
may insist that he is too nervous to perform 
efficiently. 

The cause of these disruptive influences prob- 
ably lies directly with the fear itself, though the 
impairment may be enhanced by the presence of 
conflict between the anxiety produced by the
threat of danger and the necessity of remaining 
in the situation and seeing the task through. The
disruption may also be increased by the inability 
of individuals to cope with the fear-producing 
stimuli. It is possible that situations in which
the individual can remove or reduce the stimuli 
that threaten him may not be so stressful. 

Interaction of Emotion and Motivation.— 
When we try to analyze what is happening in 
any individual’s performance under stress, it is 
usually almost impossible to separate the effects 
due to the operation of emotional disruption from 
those due to the directive characteristics of mo- 
tivation. Some subjects in an experimental sit- 
uation may show impairment in performance 
because of ego-defensive reactions. Some may 
suffer primarily because of emotional disturb-
ance. Others may be facilitated. Still other
cases may involve the operation of both facilitat- 
ing and disrupting motivational and emotional 
reactions. In fact these two may balance in any 
individual to produce no effect upon any measure 
of performance. 

These interactions present a major problem 
for the prediction of reactions to stress. The 
same effect upon performance may be achieved 
via several different routes. Some of the negative 
findings with respect to the relationships between 
personality variables and performance under 
stress may be due to the assumption by many 
psychologists that impairment is solely related to 
the lack of emotional control. Because of the 
multiplicity of causes for any individual’s par- 
ticular reaction to stress, it is not surprising that 
the scores on inventories of neuroticism and other 
measures of emotional stability show negligible
correlations with performance under stress. 

Motivation, Emotion, and the Kind of Stress-
Situation.—We have pointed out that the type
of motivation, that is, ego-defensive or need- 
persistent, the kind of approach a subject makes 
to the situation, and the effects of strong emotion 
are all possible bases for an individual’s response 
to the stress situation. It is necessary to point 
out, however, that the importance of any of these 
factors will depend, to a large extent, upon the 
conditions of the situation and the type of stress 
involved, as much as upon the personality and 
past experience of the individual. In other words, 
ego-defensive reactions are apt to be especially 
prominent in situations that utilize failure as the 
source of the stress. On the other hand, distrac-
tion situations, which are not accompanied by
threats of failure or disparagement, are less apt 
to make ego-defensive reactions necessary. 

Essentially, this boils down to a consideration 
of interactions between persons and types of 
stress. It would be interesting to know what 
kind of individual develops anxiety reactions to 
task-induced stress. We might guess that such 
people are highly motivated to perform well. The 
successful understanding of any individual’s per- 
formance under stress depends upon some way of 
measuring the kinds and strength of his motiva- 
tions and relating them to the characteristics of 
the situation in which he must perform. The 
fulfillment of this aim is, indeed, no simple affair. 

Task Components and the Interactions of Emo-
tion and Motivation.—As we have said earlier,
relatively little attention has been paid to the 
study of the components of the total performance 
of persons in stressful situations. What is going 
on, in terms of the effects on the subject's per- 
formance, may be masked by attention to only 
the total score. This score is simply the end 
product of his attack on the problem, his degree 
and direction of effort, the effects of emotional 
disturbance, and his abilities. 

A consideration of the various components of 
performance may well yield a clue as to the bases 
of the observed effects of stress upon perform- 
ance. For example, periodic blocking in a con- 
tinuous task could indicate some degree of emo- 
tional disturbance in one subject. These block- 
ings might adversely affect speed and regularity 
and, therefore, result in an impaired performance. 
On the other hand, in the same situation, another 
subject might show a greatly increased rate of 
performance at the expense of accuracy, which 
would suggest that the energizing and directing 
aspects of motivation are involved. Both sub- 
jects end up with the same total score, for dif- 
ferent reasons, and attention to the components 
of that score would provide some leads about 
what those reasons might be. 

Relating the individual components of per- 
formance to possible causes of the stress effects 
would not be simple. In many instances the 
relationship may be ambiguous. For example, 
excess overt activity which has occasionally been 
reported as an effect of stress would be difficult 
to assign to any of these possible causes. Impair-
ment of judgment might reflect emotional block-
ing or misguided effort as a result of false failure-
information. Some help would be obtained, how-
ever, by studying the combination of effects, both
qualitative and quantitative. Despite the diffi-
culties inherent in the approach we have been
suggesting, an attack on the source of the various 
stress effects and how they operate must be made 
before we can begin to understand the bases of 
the effects of stress upon skilled performance. 
It is our belief that the various individual com- 
ponents of performance can be more successfully 
related to personality variables than the total 
score.

Implications for Future Research.—The most 
obvious questions that require an answer are: 

1. How do the effects of psychological stress 
vary with the nature of the task? 

2. What are the differences in effects produced 
by failure-stress and task-induced stress? 

3. How does psychological stress affect the 
various individual components of total perform- 
ance? 

4. What is the relationship between personality 
variables, past history, and performance under 
stress? 

5. What is the relation between the capacity 
of the subject and reaction to stress? 

6. What relation do emotion and motivation 
have to performance under stress? 

Research in the area of psychological stress 
must begin practically afresh. Any systematic 
program must take into account the difficulty of 
producing realistic stress situations and making 
effective measurements of the stress effects which 
are independent of the skills required by the task 
itself. It seems certain that a really systematic 
attack on the problems of psychological stress 
will produce some valuable answers to both the- 
oretical and practical problems in behavior. 
SECTION VIII


Deviant Behavior and Its Treatment


A procedure frequently followed in personality 
research is to (1) select two or more groups of 
normal subjects differing along certain dimen-
sions, and (2) compare the behaviors of these
groups under specifiable controlled conditions. 
Many studies in this collection employ such a 
methodology. Many students of personality, par- 
ticularly those working in the clinical fields, em- 
phasize the need for analysis of deviant or ab- 
normal behavior as a means of ultimately better 
understanding personality at all levels of func- 
tioning. An additional reason for increasing in- 
terest in the attempt to better understand deviancy 
in personality is the pressing demand placed on 
persons involved in clinical work to cope with 
the mental health problem. The papers in this 
section indicate some of the ways in which ab- 
normal behavior and its treatment are being 
studied and discussed. 

The desirability of close contact between stu- 
dents of personality and those concerned with 
other experimental areas of psychology, which 
has already been suggested, is further illustrated 
in the conceptual paper presented by Ferster. 
Using several ideas, originally developed within 
the context of the analysis of learning variables, 
he has attempted to account, developmentally, 
for the manifestation of autism in childhood 
schizophrenia. It is interesting that much of the 
evidence pertaining to the concept of reinforce- 
ment, on which Ferster relies heavily, stems from 
experimentation with animals under various con- 
ditions of physiological motivation. The clarity 
with which Ferster outlines his hypotheses con- 
cerning autistic behavior will help provide a basis 
for an empirical, functional approach to behavior 
disorders. 

The article by Schofield and Balian, dealing 
with schizophrenic behavior in adults, is a doubly 
provocative one because of its contribution to the 
study of normal as well as to abnormal develop- 
ment. In their study, interviews with schizo- 
phrenic and non-psychiatric patients were ob- 
tained, categorized, and compared with respect to 
life history factors. One of their findings, that 
many of their normal subjects presented histories 
which include psychologically traumatic incidents,
gives one some pause when blithely referring to 
a “normal control group.” Another finding, that 
schizophrenic behavior does not seem to be a 
function of certain isolated traumatic events or 
conditions, suggests that emphasis on develop- 
mental patterns rather than on particular stress- 
ful incidents might lead to a better rounded view 
of psychopathological phenomena. 

Complementary to the study of historical vari- 
ables related to present behavior is the intensive 
study of present behavior itself. Osgood and 
Walker, in the research reported in the third 
article of this section, selected for study the very 
meaningful behavioral events of suicidal notes. 
Using a number of linguistic analyses suggested 
by motivational theory, Osgood and Walker 
analyze suicidal and non-suicidal notes as samples 
of verbal behavior. It seems very likely that 
methods such as those used by Osgood and 
Walker will become important tools in the study 
of verbal behavior as it applies to personality. 

Two of the articles in this section deal with 
problems associated with the psychological treat- 
ment of behavior disorders through verbal tech- 
niques. In the first of these, Raimy applies con- 
tent analysis procedures to psychotherapeutic in- 
terviews in order to determine whether or not the 
content analyses would be sensitive to a judged 
improvement over the interview series. Work- 
ing within the theoretical framework offered by 
Carl Rogers, Raimy specifically hypothesized that 
groups of patients judged to have been either 
successfully or unsuccessfully treated would differ 
with respect to self-attributes and self-references 
expressed during counseling sessions. Raimy’s 
results support this hypothesis and suggest that 
the self-concept can be a useful one in describ- 
ing personality changes as a function of psycho- 
therapeutic intervention. 

While the aim of any therapy is to improve the 
patient, a number of problems arise particularly 
with respect to psychotherapy which must be 
solved, at least partially, if meaningful statements 
are to be made concerning the therapy’s efficacy.
What constitutes change and improvement in the 
individual? Within which sorts of contexts does 
therapy seem to be most effective? Frank, in his 
article, presents a comprehensive survey of prob- 
lems besetting the researcher interested in psycho- 
therapy. Generalizing from his research at Phipps 
Clinic, Frank is especially concerned with the
problem of the definition of the control group in 
psychotherapy and its influence on the inferences 
which might be drawn from research on psycho- 
therapy. From his article, the reader obtains both 
a sound approach to the problem of controls in 
scientific investigations and a heightened respect 
for the complexity of the problem of personality 
change. 


POSITIVE REINFORCEMENT 
AND BEHAVIORAL DEFICITS 
OF AUTISTIC CHILDREN * 


C. B. FERSTER 


Infantile autism, first described by Kanner (6), 
is a very severe psychotic disturbance occurring 
in children as young as two years. At least out-
wardly, this childhood schizophrenia is a model
of adult schizophrenia. Speech and control by 
the social environment are limited or absent; tan- 
trums and atavistic behaviors are frequent and 
of high intensity; and most activities are of a 
simple sort, such as smearing, lying about, rub- 
bing a surface, playing with a finger, and so forth. 
Infantile autism is a relatively rare form of schiz- 
ophrenia, and is not important from an epi- 
demiological point of view. The analysis of the 
autistic child may be of theoretical use, however, 
since his psychosis may be a prototype of the 
adult’s; but the causal factors could not be so 
complicated, because of the briefer environmental 
history. In this paper, I would like to analyze 
how the basic variables determining the child’s 
behavior might operate to produce the particular 
kinds of behavioral deficits seen in the autistic 
child. To analyze the autistic child’s behavioral
deficits, I shall proceed from the general prin- 
ciples of behavior, derived from a variety of spe- 
cies, which describe the kinds of factors that 
alter the frequency of any arbitrary act (10), (3). 


* Reprinted by permission from Child Develop- 
ment, September, 1961, Vol. 32, No. 3, 437-456. 
The general principles of behavior applied to the
specific situations presumably present during the 
child’s developmental period will lead to hy- 
potheses as to specific factors in the autistic child’s 
home life which could produce the severe changes 
in frequency as well as in the form of his be-
havior. As an example, consider the effect of
intermittent reinforcement, many of the proper- 
ties of which are comparatively well known from 
animal experiments. To find how intermittent 
reinforcement of the autistic child’s behavior 
might produce deficits, we would first determine, 
in the general case, what specific orders of mag- 
nitude and kinds of schedules produce weakened 
behavioral repertoires. The factors in the child’s 
home life could be examined to determine esti- 
mates of what kind of circumstances could con- 
ceivably cause schedules of reinforcement capable 
of the required attenuation of the child’s be- 
havior. The analysis will emphasize the child’s 
performance as it is changed by, and affected in, 
social and nonsocial environment. As in most 
problems of human behavior, the major datum 
is the frequency of occurrence of the child’s be- 
havior. Although the account of the autistic 
child’s development and performance is not de- 
rived by manipulative experiments, it may still 
be useful to the extent that all of the terms of 
the analysis refer to potentially manipulable con- 
ditions in the child’s environment and directly 
measurable aspects of his performance. Such an 
analysis is even more useful if the performances 
and their effects on the environment were de- 
scribed in the same general terms used in system- 
atic accounts of behavior of experimental psy- 
chology. 

Some of our knowledge of the autistic child’s 
repertoire must necessarily come from anecdotal 
accounts of the child’s performance through di- 
rect observation. Although such data are not 
so useful as data from controlled experiments, 
they can be relatively objective if these perform- 
ances are directly observable and potentially 
manipulable. A limited amount of experimental 
knowledge of the dynamics of the autistic child’s
repertoire is available through a program of ex-
periments in which the autistic child has devel-
oped a new repertoire under the control of an
experimental environment (4). These experi- 
ments help reveal the range and dynamics of the 
autistic child’s current and potential repertoires. 
In general, the autistic child’s behavior will be
analyzed by the functional consequences of the 
child’s behavior rather than the specific form. 
The major attempt will be to determine what 
specific effects the autistic child’s performance 
has on that environment, and how the specific 
effects maintain the performance. 


SPECIFICATION OF THE AUTISTIC CHILD’S 
PERFORMANCE 


We must first describe the current repertoire 
of the autistic child before we can describe pos- 
sible environmental conditions that might pro- 
duce gross behavioral deficits. A topographic de- 
scription of the individual items of the autistic 
child’s repertoire would not, in general, distin- 
guish it from the repertoires of a large number 
of functioning and nonhospitalized children, ex- 
cept perhaps in the degree of loss of verbal be- 
havior. The autistic child’s behavior becomes 
unique only when the relative frequency of occur- 
rence of all of the performances in the child’s 
repertoire is considered. In general, the usual 
diagnostic categories do not adequately charac- 
terize the children in the terms of a functional 
analysis of behavior. Hospitalization of a child 
usually depends upon whether the parent can 
keep the child in the home, rather than a func- 
tional description of the role of the parental en- 
vironment in sustaining or weakening the child’s 
performance. 

Range of Performances.—Although the autis- 
tic child may have a narrower range of perform- 
ances than the normal child, the major difference 
between them is in the relative frequencies of the 
various kinds of performances. The autistic 
child does many things of a simple sort—riding 
a bicycle, climbing, walking, tugging on some- 
one’s sleeve, running, etc. Nevertheless, the au- 
tistic child spends large amounts of time sitting 
or standing quietly. Performances which have 
only simple and slight effects on the child’s en- 
vironment occur frequently, and make up a large 
percentage of the entire repertoire; for example, 
chewing on a rubber balloon, rubbing a piece of 
gum back and forth on the floor, flipping a shoe- 
lace, or turning the left hand with the right. 
Almost all of the characteristic performances of 
the autistic child may be observed in nonhos-
pitalized children, but the main difference lies in
the relative importance of each of these perform-
ances in terms of the total repertoire. Con-
versely, isolated instances of quite “normal” per-
formances may be seen in the autistic child.
Again, the relative frequency of the performances 
defines the autistic child. 

Social Control Over the Child's Performance.—
The major performance deficits of the autistic 
child are in the degree of social control: the
kinds of performances which have their major 
effects through the mediation of other individuals. 

The main avenue of social control in a nor- 
mal repertoire is usually through speech, a kind 
of performance that is unique because it pro- 
duces the consequences maintaining it through 
the mediation of a second person (12). Autistic 
children almost always have an inadequately de- 
veloped speech repertoire, varying from mutism 
to a repertoire of a few words. Even when large 
numbers of words are emitted, the speech is not 
normal in the sense that it is not maintained by 
its effect on a social environment. When normal 
speech is present, it usually is in the form of a 
mand (12). This is a simple verbal response 
which is maintained because of its direct rein- 
forcement, e.g., “Candy!” “Let me out.” The 
main variable is usually the level of deprivation 
of the speaker. It lacks the sensitive interchange 
between the speaker and listener characteristic of 
much human verbal behavior, as for example, 
the tact (see below). The reinforcement of the 
mand largely benefits only the speaker. In the 
case of the autistic child, it frequently affects the 
listener (parent), who escapes from the aversive 
stimulus by presenting a reinforcing stimulus rele- 
vant to the child’s mand. At suppertime, the 
child stands at the door screaming loudly and 
kicking the door because the ward attendants in 
the past have taken the child to supper when this 
situation became aversive enough. Sometimes, 
the form of the mand is nonvocal, although still 
verbal, as when the mand involves tugging at a 
sleeve, pushing, or jostling. The dynamic condi- 
tions which could distort the form of a mand into 
forms most aversive to a listener will be described 
below. In contrast to the mand, the tact (12) is 
almost completely absent. This form of verbal 
behavior benefits the listener rather than the 
speaker, and is not usually relevant to the current 
deprivations of the speaker. This is the form of 
verbal behavior by which the child describes his 
environment, as, for example, “This is a chair”; 
“The mailman is coming.” This latter kind of
verbal control is generally absent or weak, as with 
other kinds of verbal behavior except an occa- 
sional mand. 

Atavisms.—Tantrums, self-destructive behav- 
ior, and performances generally aversive to an 
adult audience are relatively frequent in the au-
tistic child’s repertoire. Most autistic mands de-
pend on an aversive effect on the listener for their
reinforcement. To the extent that social behavior 
is present at all, its major mode is through the 
production of stimuli or situations which are aver- 
sive enough so that the relevant audience will 
escape or avoid the aversive stimulus (often with 
a reinforcer). For example, on the occasion of 
candy in the immediate vicinity, the child 
screams, flails about on the floor, perhaps striking 
his head, until he is given some candy. There 
is evidence that much of the atavistic perform- 
ance of the autistic child is operant, that is, con- 
trolled by its consequence in the social environ- 
ment. The operant nature of the autistic child’s 
atavisms is borne out by experiments in which the children
were locked in an experimental space daily for 
over a year. There was no social intervention, 
and the experimental session was usually pro- 
longed if a tantrum was underway. Under these 
conditions, the frequency of tantrums and ata- 
visms declined continuously in the experimental 
room until they all but disappeared. Severe tan- 
trums and attempts at self-destruction still oc- 
curred when sudden changes in the conditions 
of the experiment produced a sudden change in 
the direction of nonreinforcement of the child’s 
performances. Severe changes in the reinforce- 
ment contingencies of the experiment produced 
a much larger reaction in the autistic than in the 
normal child. Consequently, we learned to 
change experimental conditions very slowly, so
that the frequency of reinforcement remained 
high at each stage of the experiment. Much of 
the atavistic behavior of the autistic child is 
maintained because of its effect on the listener. 

Reinforcing Stimuli.—The reinforcers main-
taining the autistic child’s performance are diffi-
cult to determine without explicit experimenta- 
tion. Small changes in the physical environ- 
ment as, for example, direct stimulation of the 
mouth, splashing water, smearing a sticky sub- 
stance on the floor, breaking a toy, or repeated 
tactile sensations, appear to sustain the largest
part of the autistic child’s repertoire. Neverthe-
less, these may be weak reinforcing stimuli which
appear to be strong, because the response pro- 
duces its reinforcement continuously and because 
alternative modes of responding are also main- 
tained by weak reinforcers. The durability and 
effectiveness of a reinforcer can usually be deter- 
mined best by reinforcing the behavior intermit- 
tently, or by providing a strong alternative which 
could interfere with the behavior in question. 
In the controlled experiments with autistic chil- 
dren, most of the consequences we supplied to 
sustain the children’s performance, such as color 
wheels, moving pictures, music, and so forth, 
were very weak reinforcers compared with food 
or candy. Food generally appeared to be an 
effective reinforcer, and most of the perform- 
ances associated with going to the dining room 
and eating are frequently intact. In contrast, the 
normal children could sustain very large amounts 
of behavior through the nonfood reinforcements. 
It is difficult to guess the potential effectiveness 
of new reinforcers, because the estimate depends 
upon some performance being maintained by that 
reinforcer. 

In the everyday activities of the autistic chil- 
dren, little behavior was sustained by conditioned 
or delayed reinforcers. But in a controlled ex- 
perimental situation, such activities could be sus- 
tained by explicit training. For example: (1)
the sound of the candy dispenser preceding the 
delivery of candy served as a conditioned rein- 
forcer. The fine-grain effects of the schedules 
of reinforcement show this. The difference in 
performance produced by two different schedules 
of reinforcement could have occurred only if the 
effective reinforcer were the sound of the maga- 
zine rather than the delivery of a coin. The 
actual receipt of the coin or food is much too 
delayed to produce the differences in perform- 
ances under the two schedules without the con- 
ditioned reinforcer coming instantly after a re- 
sponse; (2) with further training, the delivery 
of a coin (conditioned reinforcer) sustained the 
child’s performances. The coin, in turn, could
be used to operate the food or nonfood devices 
in the experimental room; (3) still later, coins 
sustained the child’s performance even though 
they had to be held for a period of time before 
they could be cashed in. The child worked until 
he accumulated five coins, then he deposited them 
in the reinforcing devices; (4) even longer delays
of reinforcement were arranged by sustaining 
behavior in the experimental room with a condi- 
tioned reinforcer as, for example, a towel or a 
life jacket which could be used later in the swim- 
ming pool or in water play after the experimental 
session terminated. The experimental develop- 
ment of these performances shows that even 
though the usual autistic repertoire is generally 
deficient in performances sustained by condi- 
tioned reinforcement and with delay in reinforce- 
ment, the children are potentially capable of de- 
veloping this kind of control. 

Little of the autistic child’s behavior is likely 
to be maintained by generalized reinforcement, 
that is, reinforcement which is effective in the 
absence of any specific deprivation. A smile or 
parental approval are examples. The coins de- 
livered as reinforcements in the experimental 
room are potentially generalized reinforcers, since 
they make possible several performances under 
the control of many different deprivations. How- 
ever, we do not know whether the coin has ac- 
tually acquired the properties of a generalized 
reinforcer. 

Stimulus Control of Behavior.—It is very diffi- 
cult to determine the stimulus and perceptual 
repertoire of autistic children. When a child re- 
sponds to a complex situation, it is not usually 
clear what aspect of the situation is controlling 
the child’s behavior. In most cases, it is difficult 
to determine to what extent these children can 
respond to speech discriminatively, since the sit- 
uations are usually complex, and many stimuli 
may provide the basis for the simple perform- 
ances. Similarly with visual repertoires. Con- 
trolled experiments showed unequivocally that be- 
havior can come under the control of simple 
stimuli when differential effects of the perform- 
ances were correlated with the different stimuli. 
When a coin was deposited in a lighted coin slot, 
it operated the reinforcing device. Coins de- 
posited in unlighted slots were wasted. The chil- 
dren soon stopped putting coins in the unlighted 
slots. The previously developed stimulus con- 
trol broke down completely when these stimuli 
were placed in a more complicated context, how- 
ever. A new vending machine was installed with 
8 columns, 8 coin lights, and 8 coin slots, so the 
child could choose a preferred kind of candy. 
The slight increase in complexity disrupted the 
control by the coin light, and it took several
months and many experimental procedures be- 
fore the stimulus control was re-established. A 
better designed procedure, in view of the mini- 
mal perceptual repertoire of these children, would 
have been a gradual program by which variations 
in the specific dimensions of the coin slot and 
coin light were changed while the reinforcement 
contingency was held constant in respect to the 
essential property. 

In summary, the repertoire of the autistic child 
is an impoverished one. Little is known about 
the perceptual repertoire, but the available evi- 
dence suggests that it is minimal. The absolute 
amount of activity is low, but this deficit is even 
more profound if the specific items of activity are 
evaluated in terms of whether they are maintained 
by significant effects on a social or even nonsocial 
environment. Most of the child’s performances 
are of a simple sort, such as rubbing a spot of 
gum back and forth, softening and twisting a 
crayon, pacing, or flipping a shoelace. Those 
performances in the child’s repertoire having so- 
cial effects frequently do so because of their 
effects on the listener as aversive stimuli. Ata- 
visms and tantrums are frequent. 


THE EMERGENCE OF PERFORMANCE DEFICITS 
DURING THE EARLY DEVELOPMENT OF 
THE AUTISTIC CHILD 


Having characterized the autistic child’s reper- 
toire, the next step is to determine the kinds of 
circumstances in the early life of these children 
which could bring about the behavioral deficits. 
The general plan is to state how the major be- 
havioral processes and classes of variables can 
drastically reduce the frequency of occurrence 
of the various behaviors in the repertoire of any 
organism. Then, the parental environment will 
be examined to determine circumstances under 
which the actual contingencies applied by the 
parental environment to the child’s behavior 
could weaken the child’s performance similarly. 
The datum is the frequency of occurrence of all 
of the acts in the child’s repertoire, and the inde- 
pendent variables are the consequences of these 
acts on the child’s environment, particularly the 
parental environment. All of the terms in such 
a functional analysis are actually or potentially 
directly observable and manipulable. In general, 
the performances in the child’s repertoire will be 
simultaneously a function of many factors, each
contributing to changes in the frequency of the 
relevant performances. It is important, there- 
fore, to consider relative changes in frequency 
rather than simple presence or absence of various 
performances. The datum is the frequency of 
occurrence of the behavior. In the same vein, 
singly identifiable factors may be interrelated and 
functioning simultaneously. 

The major paradigm for describing the be- 
havior of an organism is to specify the conse- 
quences of the act (reinforcement) which are 
responsible for its frequency. In this sense, the 
major cause of an instance of behavior is the 
immediate effect on the environment (reinforce- 
ment). The continued emission of the verbal re- 
sponse “Toast” depends on its effect on the 
parent in producing toast. Every known be- 
havioral process influencing the frequency of a 
positively reinforced performance is relevant to 
the problem of defining conditions under which 
we may produce a behavioral deficit. Given the 
variables which maintain it, a performance may 
be weakened by their absence or by changing the 
order of magnitude. It is perhaps surprising to 
discover that large behavioral deficits are plausi- 
ble without any major appeal to punishment or 
suppression of behavior by aversive stimuli. 

Intermittent Reinforcement and Extinction — 
Intermittent reinforcement and extinction are the 
major techniques for removing or weakening be- 
havior in a repertoire. The most fundamental 
way to eliminate a kind of behavior from an or- 
ganism’s repertoire is to discontinue the effect 
the behavior has on the environment (extinction). 
A performance may also be weakened if its main- 
taining effect on the environment occurs intermit- 
tently (intermittent reinforcement). Behaviors 
occurring because of their effects on the parent 
are especially likely to be weakened by intermit- 
tent reinforcement and extinction, because the 
parental reinforcements are a function of other 
variables and behavioral processes usually not 
directly under the control of the child. The
verbal response “Give me the
book” may go unreinforced because of many
factors which determine the behavior of the lis- 
tener. He may be preoccupied, listening to some- 
one else, disinclined to reinforce, momentarily 
inattentive, etc. In contrast, the physical en- 
vironment reinforces continuously and reliably. 
Reaching for a book is usually followed by the
tactile stimulation from the book. Verbal be- 
havior, particularly, depends entirely for its de- 
velopment and maintenance on reinforcements 
supplied by an audience (usually a parent). Be- 
cause of the possibility of prolonged extinction 
and infrequent, intermittent reinforcement, speech 
and social behavior are the most vulnerable as- 
pects of the child’s repertoire. The young child 
is particularly vulnerable to the extinction and 
intermittent reinforcement occurring in social re- 
inforcement because only the parental environ- 
ment mediates nearly all of the major reinforcers 
relevant to his repertoire. Large parts of the 
child’s repertoire are reinforced by first affecting 
a parent who in turn produces the reinforcer for 
the child. 
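
As a rough illustration of the argument above, the following Python sketch compares how the expected probability of a response changes under continuous reinforcement, intermittent reinforcement, and extinction. The linear update rule, the function name, the learning rates, and the starting probability are all assumptions introduced here for illustration; none of them comes from the text.

```python
def expected_strength(p0, reinforcement_prob, trials, gain=0.2, loss=0.1):
    """Track the expected probability of a response over repeated emissions.

    Toy linear-operator model (an illustrative assumption, not the author's):
      reinforced emission:   p <- p + gain * (1 - p)
      unreinforced emission: p <- p - loss * p
    reinforcement_prob is the chance that a given emission is reinforced.
    """
    p = p0
    history = [p]
    for _ in range(trials):
        up = p + gain * (1.0 - p)
        down = p - loss * p
        # Expected value of the update, weighted by how often reinforcement occurs.
        p = reinforcement_prob * up + (1.0 - reinforcement_prob) * down
        history.append(p)
    return history


# Hypothetical conditions, chosen only to contrast the three cases.
continuous = expected_strength(0.5, 1.0, 50)    # reliable physical environment
intermittent = expected_strength(0.5, 0.2, 50)  # often preoccupied listener
extinction = expected_strength(0.5, 0.0, 50)    # effect on the parent discontinued

print(round(continuous[-1], 3), round(intermittent[-1], 3), round(extinction[-1], 3))
```

In this deliberately crude model the response approaches full strength when every emission is reinforced, settles at a much lower level when only one emission in five is reinforced, and falls toward zero under extinction.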

Factors in the Parental Repertoire Affecting 
the Frequency of Reinforcement of the Child’s 
Performances. —To find the condition under 
which the child’s repertoire will be weakened, 
therefore, we must look for conditions influenc- 
ing the parents’ behavior, which will alter the 
parental performances, which in turn provide 
reinforcement of the child’s performances. These 
might be: 

1. The general disruption of the parental 
repertoire. Any severe disruption of the parental 
repertoire will severely affect the frequency with 
which the parent reinforces the behavior of the 
child. Consider, for example, the depressed parent
whose general level of behavior is very low.
One consequence of this low level of behaving 
will be a lessened frequency of reacting to the 
child. Therefore, many items in the child’s 
repertoire will be less frequently reinforced by
the depressed than by the normal parent. The verbal
responses “May I have some bread” or “I
want to go outside” might go unreinforced or be 
emitted many times without reinforcement. Vari- 
ous kinds of somatic disturbances, such as alco- 
holic “hangover,” drug addiction, severe head- 
ache, somatic diseases, etc., could also produce 
large changes in the overall reactivity of the 
parent to a child. To the extent that the child’s
performances occur because of their effect on 
the parent, the severely weakened parental reper- 
toire may correspondingly weaken the child’s be- 
havior. If the parental extinction of the child’s 
behavior is systematic and periodic, much of a 
child’s behavior could be eliminated. 


2. Prepotency of other performances. 
Whether or not a parent reinforces a child’s per- 
formance also depends upon the alternative reper- 
toire available to the parent. For example, the 
parent who is absorbed in various kinds of activi- 
ties such as house cleaning, a home business, so- 
cial activities and clubs, active telephoning, and 
so forth, may at various times allow many usually 
reinforced performances to go unreinforced. In 
general, the likelihood of omitting reinforcement 
would depend upon the strength of the prepotent 
repertoire. As an example of a prepotent reper- 
toire, the housewife absorbed in a telephone con- 
versation will not be inclined to answer a child 
or comply with a request. House cleaning might 
be another repertoire controlling some parents’ 
behavior so strongly that it is prepotent over be- 
havior in respect to the child. In both cases, the 
essential result is the nonreinforcement of the 
child’s behavior in competition with the prepotent 
parental repertoire. Mothers of autistic children 
often appear to have strong repertoires prepotent 
over the child. This may be at least a partial 
reason why mothers of autistic children are so 
often well-educated, verbal, and at least super- 
ficially adequate people. 

3. A third factor producing intermittent rein- 
forcement of the child’s behavior is related to the 
first two factors listed above. If the parent finds 
other reinforcers outside of the home more re- 
warding than dealing with the child, the child be- 
comes an occasion on which the significant ele- 
ments of the parental repertoire cannot be rein- 
forced. A parent changing diapers, or otherwise 
taking care of a child, cannot telephone a friend, 
be out socializing, be on a job, or do whatever the
mother of the autistic child finds rewarding. The child
acquires the properties of a conditioned aversive 
stimulus because it is an occasion which is in- 
compatible with the parents’ normal repertoire. 
This is of course the major method of aversive 
control in human behavior—the discontinuation 
of positive reinforcement. Another basis for es- 
tablishing the child as a conditioned aversive stim- 
ulus to the parent is the emergence of atavisms 
and a large degree of aversive control of the par- 
ent by the child. To the extent that the parent is 
reinforced by escaping from the child because of 
his conditioned aversive properties, the frequency 
of the parental reinforcement of the child’s be- 
havior is further reduced. 

The development of the atavistic behavior in the 
child by the parent is necessarily a very gradual
program in which the beginning steps involve 
small magnitudes of behavior such as whining, 
whimpering, and crying. As the parent adapts 
to these or becomes indifferent to them because 
of the prepotence of other kinds of activity, then 
progressively larger orders of magnitude become 
reinforced. The large-magnitude tantrum may 
be approximated or “shaped” by gradual differen- 
tial reinforcement. The parents of one autistic 
child, for example, at one period took turns all 
night standing in the child’s room because one 
step out of the room would immediately produce 
a severe tantrum in the child. When the child 
functions as a conditioned aversive stimulus for 
the parent, the parent is less likely to reinforce 
the child’s behavior positively. This lack of posi- 
tive reinforcement, in turn, emphasizes the atavis- 
tic responses on the child’s part as the major 
mode of affecting the parent. 

The usual limiting factor in preventing exces- 
sive development of tantrums is the emergence 
of self-control on the part of the parent in escap- 
ing from the aversive control by the child rather 
than reinforcing it. Here, again, the repertoire 
of the parent is relevant. The development of 
self-control requires a highly developed repertoire 
which depends for its development on the ulti- 
mate aversive consequences of the child’s control 
of the parent. The child’s control becomes more 
aversive to the parent if it interrupts strong reper- 
toires. Specifically, a parent engrossed in a con- 
versation will find a child’s interruption more 
aversive than a parent who is simply resting. If, 
in fact, there is no strong behavior in the par- 
ent, then the child’s control is not likely to be 
aversive, and there is no basis for developing 
self-control. 

All three of the above factors—overall disturbances
in the parental repertoire, prepotent activities,
and escape from the child because of his
aversiveness—reduce the amount of parental reinforcement
of the child’s performances. The
overall effect of the nonreinforcement on the 
repertoire of the child will depend upon the 
length of time and number of items of the child’s 
repertoire that go unreinforced, as well as the 
existence of other possible social environments 
that can alternatively maintain the child’s be- 
havior (see below). 

The Differential Reinforcement of Atavistic 
Forms of Behavior by the Parent—The schedule 
by which the parent reacts to the child is also relevant
to the development of atavistic behavior.
Initially, a tantrum may be an unconditioned con- 
sequence of parental control as, for example, sud- 
den nonreinforcement or punishment. Eventu- 
ally, however, the child’s tantrums may come to 
be maintained by their effect on the parental 
environment, because they present an aversive 
situation that can be terminated if the parent 
supplies some reinforcer to the child. The rein- 
forcer presented by the parent to escape from 
the aversive consequences of the tantrum also in- 
creases the subsequent frequency of atavistic re- 
sponses. 

The effect on the parent of the given form and 
intensity of tantrums will vary from time to time, 
depending on the conditions maintaining the par- 
ents’ behavior. This variation in sensitivity of 
the parent to aversive control by the child results 
in a variable-ratio schedule of reinforcement of 
the child’s tantrum by the parent—a schedule of 
reinforcement potentially capable of maximizing 
the disposition to engage in tantrums. This is 
the schedule of reinforcement that produces the 
high frequencies of performances as in gambling 
(10). The sensitivity of the parent to aversive 
control by the child will depend on the general 
condition of the parental repertoire as discussed 
above. The same factors in the parental reper- 
toire that tend to produce nonreinforcement of 
the child’s behavior—general disruption of the 
parent or other behavior prepotent over the child 
—correspondingly produce reinforcement of large- 
order-of-magnitude tantrums. The parent whose 
total repertoire is severely enough disrupted to 
interfere with the normal reinforcement of the 
child’s behavior will also react only to tantrums 
that are of large-order-of-magnitude of aversive- 
ness. A range of sensitivity of the parent to aver- 
sive control by the child produces ideal conditions 
for progressively increasing the intensity or fre- 
quency of tantrums. A high sensitivity to aver- 
sive control guarantees that some tantrums will be 
reinforced at least periodically. A low sensitivity 
differentially reinforces tantrums of large-orders- 
of-magnitude. At one extreme, the parent may 
be hypersensitive to the child, and at other times, 
so depressed that only physical violence will pro- 
duce a reaction. The schedule by which the par- 
ent’s behavior terminates the tantrum is a second 
factor which will increase the range of reactivity 
of the parent. As more behavior is required of 
the parent to terminate the tantrum, the parent’s
inclination to do so will fall. When the parent 
is less inclined to reinforce a given intensity of 
tantrum, any variation in tantrum intensity is tan- 
tamount to differential reinforcement of extreme 
forms, if the parent now reacts to the larger- 
order-of-magnitude tantrum. 
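
The relation between a varying parental sensitivity and a variable-ratio schedule can be sketched in a few lines of Python. This is a toy simulation under stated assumptions: the intensity scale, the list of thresholds, and the function name are hypothetical, and the parent's momentary threshold is simply drawn at random on each tantrum.

```python
import random


def tantrums_per_reinforcement(intensity, thresholds, n_tantrums, seed=0):
    """Count how many tantrums precede each parental reaction.

    On each tantrum the parent's momentary threshold is drawn at random from
    `thresholds` (a hypothetical range of sensitivity). The parent reacts,
    and so reinforces the tantrum, only if its intensity reaches the threshold.
    """
    rng = random.Random(seed)
    ratios = []
    count = 0
    for _ in range(n_tantrums):
        count += 1
        if intensity >= rng.choice(thresholds):
            ratios.append(count)  # tantrums emitted for this reinforcement
            count = 0
    return ratios


# Hypothetical range of sensitivity: from hypersensitive (1) to nearly unreactive (7).
sensitivity_range = [1, 3, 5, 7]
mild = tantrums_per_reinforcement(2, sensitivity_range, 1000)
severe = tantrums_per_reinforcement(6, sensitivity_range, 1000)

print(sum(mild) / len(mild), sum(severe) / len(severe))
```

Under these assumptions the number of mild tantrums required for each reinforcement varies unpredictably from one occasion to the next, which is the defining feature of a variable-ratio schedule, while severe tantrums are reinforced after far fewer emissions, which amounts to differential reinforcement of the extreme forms.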

How much the parent differentially reinforces 
tantrums in the child depends, in part, upon the 
child’s other positively reinforced repertoires. 
When, for example, a child’s performance sud- 
denly goes unreinforced, as when a parent may 
refuse a request, the likelihood and severity of a 
tantrum will in part depend on the parent’s ability 
to “distract” the child. This, in turn, depends 
upon whether alternative modes of behavior are 
in fact available to the child. When conditions 
are present for the progressive reinforcement of 
more and more severe tantrums, the process is 
potentially non-self-limiting. Autocatalysis is 
likely to occur, particularly if the parent has lit- 
tle disposition to reinforce the general items in 
the child’s repertoire for reasons other than ter- 
minating the aversive demands of the child. 

Nonsocial Reinforcers—Some of the child’s 
behavior is maintained by his direct effect on the 
physical environment without the intervention of 
other individuals. In general, very small effects 
on the environment will sustain performances with 
which the parent usually has little reason to in- 
terfere. For example, the child plays with his 
own shoelace, moves his fingers in his own visual 
field, emits minimal nonverbal, vocal responses, 
and so forth. Larger effects on the physical en- 
vironment as, for example, moving objects about 
the house, speaking to the parent, playing with 
toys, touching and handling usual household ob- 
jects, are more likely to enter upon the parental 
repertoire, and so may produce a response whose 
effect is to discontinue the behavior or interfere 
with its reinforcement. The punishment aspect 
of the parental interference with the child’s activi- 
ties will be dealt with separately below. The 
relative possibility of parental interference and 
nonreinforcement of the hierarchy of perform- 
ances may account for the large part of the autis- 
tic child’s repertoire, which consists of behaviors 
having small, limited effects on the physical en- 
vironment. Occasionally, even behaviors that are 
maintained by the most simple effects on the en- 
vironment are extinguished or punished when they 
occur in the presence of a parent. For example, 
the father of one autistic child reports that the
child reached for a chandelier while he was hold- 
ing him. The father instantly dropped the child, 
with a reaction of considerable disapproval be- 
cause “You should pay attention to me when 
you’re with me.” Aside from the secondary ef- 
fect on the child, the immediate result of the inci- 
dent is the nonreinforcement of the child reach- 
ing for a common physical object. 

The existence of “nonverbal” vocal behavior in 
some autistic children may be related to forms of 
vocal behavior with which the parent will or will 
not interfere. Vocal behavior maintained by its 
effect on a parent (verbal) is susceptible to weak- 
ening by parental extinction. A parent interferes 
less easily with vocal behavior maintained by its 
direct effect (nonverbal) comparable with mak- 
ing noise by rubbing a stick over a rough surface. 
Further, such nonverbal vocal responses can 
emerge readily at any stage of the child’s life, 
unlike verbal behavior, because it does not de- 
pend on a generalized reinforcement. 

Failure to Develop Conditioned and General- 
ized Reinforcers—The normal repertoire of the 
child consists almost entirely of sequences of be- 
havior that are maintained, in a chain or sequence, 
by conditioned and generalized reinforcers (10). 
An example of a chain of responses would be the 
behavior of the child moving a chair across the 
room and using it to climb to a table top to reach 
a key which in turn opens a cupboard containing 
candy. This complicated sequence of behavior 
is linked together by critical stimuli which have 
the dual function of sustaining the behavior they 
follow (conditioned reinforcement) and setting 
the occasion for the subsequent response. The 
chair in the above example is an occasion on 
which climbing onto it will bring the child into a 
position where reaching for food on the table top 
will be reinforced by obtaining food. Once this 
behavior is established, the chair in position in 
front of the table may now be a reinforcer, and 
any of the child’s behavior which results in moving
the chair into position will be reinforced because
of the subsequent role of the chair in the
later chain of behaviors. A minimal amount of 
behavior is necessary before a chain of responses 
can develop. The development of the control by 
the various stimuli in the chain, both as discrim- 
inative stimuli setting the occasion for the rein- 
forcement of behavior and as reinforcers, depends 
upon a high level of activity, so that the responses 
will occur and come under the control of the
stimuli. This is even more true for the develop- 
ment of the generalized reinforcer. When the 
child has moved enough objects about the house 
and achieved a variety of effects on his environ- 
ment relative to a range of deprivations and re- 
inforcers, simply manipulating the physical en- 
vironment may become a reinforcer without refer- 
ence to a specific level of deprivation. This, of 
course, is the uniquely human reinforcer that 
makes possible much of verbal behavior, educa- 
tion in general, and self-control. Again, large 
amounts of behavior—many chains of behavior 
with many different kinds of conditioned rein- 
forcers—are a necessary condition for the emer- 
gence of a generalized reinforcer. To the extent 
that the child’s repertoire becomes weakened by 
intermittent reinforcement and extinction, as men- 
tioned above, and punishment and aversive con- 
trol (see below), the possibility of the develop- 
ment of generalized reinforcers, and hence more 
complex behavior, becomes less and less likely. 
Parental “attention” is probably one of the most 
important generalized reinforcers normally main- 
taining the child’s behavior. Parental attention is 
an occasion upon which the child’s performances 
may have an important effect on the parent. In- 
attention is an occasion on which the child’s re- 
sponses are likely to have little effect. Hence, 
the parents’ performances in smiling, saying 
“Right,” “Good boy,” “Thank you,” all come to
function as conditioned reinforcers. Their emer- 
gence as generalized reinforcers again depends 
upon the existence of a large behavioral reper- 
toire. A large number of chains of responses 
will produce important positive effects when the 
parent smiles or says “Good boy.” Lower fre- 
quencies of reinforcement follow for these same 
activities when the parent is frowning or says 
“Bad boy.” 
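
The chair-and-cupboard chain described earlier in this section can be written out as a small data structure, which makes the dual function of the linking stimuli explicit. The sketch below is only illustrative: the class name, the field names, the wording of each link, and the starting occasion ("chair away from table") are assumptions added here, not terms from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    occasion: str  # discriminative stimulus that sets the occasion
    response: str  # behavior emitted on that occasion
    outcome: str   # stimulus the response produces


# The chair, key, cupboard, and candy sequence, written as links (wording is hypothetical).
CHAIN = [
    Link("chair away from table", "move chair across room", "chair at table"),
    Link("chair at table", "climb to table top", "key within reach"),
    Link("key within reach", "take key and open cupboard", "cupboard open"),
    Link("cupboard open", "take candy", "candy"),
]


def dual_function_stimuli(chain):
    """Stimuli that reinforce one response and set the occasion for the next."""
    return [a.outcome for a, b in zip(chain, chain[1:]) if a.outcome == b.occasion]


def run_chain(chain, available_responses):
    """Follow the chain until a required response is missing from the repertoire.

    Returns the outcomes actually produced.
    """
    produced = []
    for link in chain:
        if link.response not in available_responses:
            break
        produced.append(link.outcome)
    return produced


full_repertoire = {link.response for link in CHAIN}
weakened_repertoire = full_repertoire - {"climb to table top"}

print(dual_function_stimuli(CHAIN))
print(run_chain(CHAIN, full_repertoire))
print(run_chain(CHAIN, weakened_repertoire))
```

When one response is missing from the repertoire, the chain stops at that point and the later stimuli are never produced, so in this sketch they never get the opportunity to function as conditioned reinforcers.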

Any large reduction in the child’s overall per- 
formance will interfere with the initial develop- 
ment of conditioned reinforcers or their continued 
effectiveness. The control by the environment 
over the child’s behavior depends first upon the 
emission of the behavior. This follows from the 
manner in which the environment comes to con- 
trol the child’s performance: the successful exe- 
cution of an act on one occasion, coupled with 
the unsuccessful act in its absence. Until a child 
climbs on chairs, as in the previous example, a 
chair has little chance of becoming a discriminative
stimulus. Without the development of stimulus
control, conditioned reinforcers cannot develop.
The reinforcing effect of the chair in the
above example depends upon its being the occa- 
sion on which further performances may be rein- 
forced. In this way, a low general level of be- 
havior may impede the enlargement of the child’s 
repertoire because it does not allow stimulus con- 
trol and in turn prevents reinforcement of new 
behavior. A limited development of simple conditioned
reinforcers in turn prevents the development
of a generalized reinforcer. Parental responses,
such as smiling, “Good,” or “Right,” can
have little effect on the child if there is not a his- 
tory by which on these occasions many different 
forms of the child’s performance have produced 
various reinforcers. Without the parental gener- 
alized reinforcement, educational processes and 
positive parental control are all but impossible. 
This control is normally carried out by the use 
of praise and parental attention, coupled with mild
forms of threats of discontinuing the reinforcers. 
Even after a generalized reinforcer has acquired 
its function, its continued effectiveness depends 
on the various stimuli continuing to stand in a 
significant relation to the child’s performance. If 
the child has no behavior in his repertoire that 
will be more likely to be reinforced on the occa- 
sion of a parental smile, it matters little what the 
parent’s reinforcing practices are when smiling 
as against when frowning. 


STIMULUS CONTROL 


The specific occasions on which a child’s per- 
formances have their characteristic effects on the 
environment will subsequently determine whether 
the child acts. In the absence of the characteristic 
circumstances under which the behavior is nor- 
mally reinforced, the child will be less disposed 
to act in proportion to the degree of similarity 
with the original situation. Changing a stimulus 
to one which has not been correlated with re- 
inforcement is another way of weakening a reper- 
toire. New stimuli also elicit emotional responses 
and general autonomic effects that may interfere 
with established performances. Here, simply repeated
exposure to the stimuli may produce adaptation
to the stimuli and eliminate their emotional
effects. Ordinarily, the infant’s performances
are under the control of a limited range
of stimuli, usually one or two parents in a lim- 
ited part of a specific home environment. The 
discriminative repertoire broadens as the child
grows older and other individuals come to be 
occasions on which his performances have sig- 
nificant effects. The parental environment of the 
very young child narrows the control of the child’s 
performance to a limited range of stimuli, largely 
because the parent mediates almost all of the im- 
portant events affecting the child. A major factor 
which brings the child’s behavior more narrowly 
under the control of the parent is the nonrein- 
forcement of much of the child’s behavior in the 
absence of the parent. The close control of the 
child’s behavior by the parent weakens the child’s 
repertoire in the absence of a parent much more 
when there has been explicit differential reinforce- 
ment than when there has been simply a limited 
reinforcing environment. 

Sudden shifts in the child’s environment may 
or may not produce major performance deficits. 
At one extreme, a sudden shift of the stimuli in 
the child’s controlling environment will have little 
influence if the child already has been reinforced 
on the occasion of a wide range of circumstances 
and individuals. At another extreme, a repertoire 
can be eliminated almost completely if the child 
has had a history in which major kinds of per- 
formances have gone unreinforced except on the 
occasion of a single person in a specific environ- 
ment. The sudden shifts in the situations and 
persons controlling the child’s behavior may occur 
under a variety of circumstances, such as a sudden 
change in a constant companion, death of a par- 
ent, or a sudden shift in the physical environment. 
A sudden shift in the environment of one of the 
subjects reported in the previously mentioned ex- 
periment could conceivably have been the major 
factor in her autistic development. Many of the 
activities of the child’s mother were prepotent 
over dealing with the child, and she solved the 
problem by hiring a teenage baby sitter as a 
constant companion and nursemaid. After a 
year, the baby sitter left, suddenly and abruptly, 
leaving the child with the mother. Within 4 
months, the child began to behave less in general, 
lost speech, and showed increasing frequency of 
atavisms. The child’s repertoire possibly was un- 
der such close control of the baby sitter that the 
very sudden change to the mother created an en- 
vironment which in the past had been correlated 
with nonreinforcement. If the child’s behavior 
were under very narrow control by the baby sitter, 
because of the nonreinforcement on all other occasions,
a sudden shift, as in the loss of the baby
sitter, could produce a dramatic deficit in the 
child’s repertoire. 

Disruptive Effect of Sudden Stimulus Changes 
and the Amount, Durability, and Range of Be- 
havior. —A novel reinforcing environment will not 
sustain a child’s performance unless the repertoire 
contains behavior of a sufficient range and dura- 
bility. The new environment weakens the per- 
formance because it nearly always requires slightly 
different forms of behavior. For example, a new 
person entering a child’s home is not so likely to 
respond successfully to the incompletely devel- 
oped verbal behavior of a child as the parent. 
The possibility of the child’s affecting the stranger 
will depend upon his having verbal responses dif- 
ferent from those usually reinforced by the par- 
ent, and, also, durable verbal behavior that will 
continue to be emitted under the intermittent re- 
inforcement that is likely to occur. If the child’s 
repertoire is durable and extensive enough so that 
the verbal response may be repeated several times 
and supplemented by auxiliary behavior, the child 
has a greater chance of affecting the new person, 
or of being shaped by him. Similarly with other 
kinds of social behavior. The wider the range 
of behavior and the greater the disposition to 
emit it, the more likely that the child’s performance
will be within the range of responses potentially
reinforcible by the new environment.

For a stimulus to acquire control over behavior, 
the child must first emit behavior in the presence 
of the stimulus. Consider, for example, the per- 
formance of a child at a children’s party at which 
there are lots of toys and games, such as bicycles, 
swings, and so forth. The likelihood of the child’s 
behavior coming under the control of any of the 
other children as reinforcers is minimal if the new 
environment suppresses or makes the child’s en- 
tire repertoire unavailable because it is a novel 
stimulus and is an occasion on which the child’s 
behavior has never been reinforced. If the be- 
havior of playing with a swing or riding a tri- 
cycle is sufficiently strong that it may be emitted 
even under the adverse conditions of the very 
strange party environment, then the simple emis- 
sion of the previously developed behavior pro- 
vides a situation under which other children at 
the party may potentially reinforce or otherwise 
affect the child’s repertoire. Simply the acts of 
eating cake, candy, or ice cream, or picking up a 
toy put some of the child’s behavior under the 
control of the new environment. Each new performance
which can potentially occur at the party
provides a basis for the child’s reinforcing some 
behavior of other children at the party, or of his 
coming under the control of the other children’s 
reinforcers. On the other hand, a sudden expo- 
sure to a new environment with a weak and nar- 
row repertoire may produce a severe behavioral 
deficit. In any case, the child will be much less 
disposed to go to the party if he has behaved unsuccessfully
in the new environment. This lower
disposition to attend and engage in the party would 
in turn make it less likely that the child will emit 
behavior that would be reinforced in the party 
environment. 

Adaptation.—The emotional and elicited auto- 
nomic effects of novel environments may also in- 
terfere with a child’s performances. Adaptation 
to new environments occurs with gradual expo- 
sure. A sudden exposure to a new environment 
will produce gross emotional and autonomic re- 
sponses which will in turn interfere with, or even 
completely suppress, the emission of possible oper- 
ant behavior potentially reinforcible by the new 
environment. The rate at which the child is ex- 
posed to the new environments will determine the 
magnitude of disturbance. Exposure to a new en- 
vironment and adaptation of the emotional re- 
sponses do not necessarily create the potential 
basis for responding, however. A repertoire that 
will make contact with the new environment is 
also necessary. 

The Amount of Prior Nonreinforcement.—The 
more closely controlled the child’s performances 
are by specific stimuli, the more likely a sudden 
shift in the environment will produce a cessation 
of responding. For example, the child receiving 
minimal care from a parent probably will be less 
affected by a sudden shift in environment than a 
child closely affected and controlled by parental 
response. It is paradoxical that the parent who 
responds sensitively to the child’s performances 
may be potentially weakening them more than the
parent who exerts little control over the child. 
It is the alternate reinforcement and nonreinforce- 
ment that place the child’s behavior narrowly un- 
der the control of very specific stimuli so that it 
is much more vulnerable to sudden changes. The 
range of stimuli in whose presence the child's 
behavior goes unreinforced will determine the 
narrowness of the stimulus control. 


CUMULATIVE EFFECTS OF A BEHAVIORAL DEFICIT 


The continuous development of more and more 
complex forms of a child’s behavior is normally 
achieved because the parents and community ap- 
proximate the required performances. At each 
stage of the child’s development, the community 
reinforces the child’s current repertoire even 
though it is more disposed to react to small in- 
crements in the child’s performance in the direc- 
tion of the required complex performances. 
Should any of the above processes produce a 
deficit in performance or an arrest in the devel- 
opment of the child’s performance, further de- 
velopment of a repertoire would depend upon the 
community's relaxing its requirements and rein- 
forcing performances in an older child that it 
normally accepts only from a younger one. Ordi- 
narily, the reinforcing practices of the community 
are based on the chronological age and physical 
development of the child. 

Only between the ages of 1½ and 4 years does 
the parent have sufficient control of the child to 
weaken his performance to the degree seen in in- 
fantile autism. This is a critical period in the 
child’s development during which his behavior is 
especially susceptible to extinction, because the 
traditional social pattern in the usual family re- 
stricts the child’s experience to one or two par- 
ents. Before the age of a year and a half, the 
child has few performances with which the par- 
ent will interfere or that have important effects 
on the parent. Much of the infant’s behavior is 
maintained by simple and direct effects on its en- 
vironment. As the child approaches two years, 
the rapid development of a behavioral repertoire, 
particularly social and verbal behavior, makes 
possible extinction and other forms of weaken- 
ing. The effectiveness of the parental environ- 
ment in weakening the child’s repertoire depends 
upon the availability of concurrent audiences for 
the child’s behavior. In general, the two-year- 
old child is limited to the home and comes into 
increasing contact with other environments as he 
grows older, perhaps reaching a maximum at 
school age. The presence of an older sibling 
might appear to preempt the possibility of a suffi- 
cient degree of isolation to account for an aver- 
sive behavioral deficit. A sibling could provide 
an alternative to the parent as a reinforcing en- 
vironment. The behavioral or functional influ- 
ence of a sibling would depend on the amount and 
nature of interaction between the children. For 
example, an older child might possibly completely 
avoid the younger one or tend to have the same 
patterns of reaction as the parent. In many 
cases, the older sibling has playmates outside the 
home to the complete exclusion of the younger 
child. The older sibling, in many circumstances, 
punishes as well as extinguishes the younger child 
for any attempted participation in his play. There 
are very few facts as to the exact nature of the 
interactions in most cases. 

The parent as the sole maintainer of the child’s 
behavior is perhaps even more likely when the 
child is raised in a rural or isolated community, 
and perhaps with one of the parents largely ab- 
sent. The above analysis suggests that a survey 
of severely autistic children would, in general, 
show them to be first-born children; or, if other 
siblings were available, they would have provided 
little interaction with the child. It also suggests 
that the child would be raised in a house physically 
or socially isolated from other families or children 
such that there were no alternative social environ- 
ments that could provide reinforcement for the 
child’s behavior. When the child was exposed 
to both parents, it would be expected that both 
parents were consistent in their nonreinforcement 
of the child’s performances. 


AVERSIVE CONTROL AND PUNISHMENT 


It has been possible to describe conditions which 
might produce major behavioral deficits without 
dealing with punishment or aversive control. A 
similar account might present a functional analysis 
of how performance deficits might occur as a 
result of aversive control. Many writers have al- 
ready described some of these factors by extend- 
ing general principles of aversive control to hu- 
man behavior (7, 71, 8). For the purposes of 
the analysis presented in this paper, I would like 
to restrict the discussion of aversive control to 
its relation to positive reinforcement. Much of 
human aversive control is carried out by discon- 
tinuing or withdrawing reinforcement (10, 3). 
For example, a frown or criticism may function 
as an aversive stimulus because these are occa- 
sions on which reinforcements are less likely to 
occur. Even when corporal punishment is given, 
it is not clear as to whether the resulting effect 
on the child’s behavior is due to a slap or to the 
lower inclination of a punishing parent to rein- 
force. Most parents who spank a child will be 
indisposed to act favorably toward the child for 
some period of time subsequently. As a result, 
one major by-product of frequent punishment 
may be a larger order of interference with the 
child’s normal repertoire along the lines of the 
positive-reinforcements deficits described above. 

The obvious effectiveness of punishment in some 
kinds of human control appears to contradict ex- 
perimental findings with animals which show pun- 
ishment to have only a temporary effect on be- 
havior (9, 2, 1). The role of positive-reinforce- 
ment factors helps resolve the dilemma. The 
effectiveness of punishment depends on how 
strongly the punished behavior is maintained by 
positive reinforcement. The apparent effective- 
ness of punishment in the control of children may 
occur when weak repertoires are punished or when 
the punishment indirectly produces extinction. 
Most animal experiments using electric shock as 
an aversive stimulus have used strongly main- 
tained positively reinforced operant behavior as 
the base-line performance to be punished. The 
aversive control might be more effective when the 
performances to be punished are less strongly 
maintained. As might be expected from the rela- 
tively low frequency of infantile autism, the com- 
bination of circumstances hypothesized above 
would occur only rarely. The above hypothesis 
provides a framework for investigating the cir- 
cumstances surrounding the development of the 
autistic child. All of the variables that might 
weaken the behavior of a child are directly or 
potentially observable. The data required are the 
actual parental and child performances and their 
specific effects on each other, rather than global 
statements such as dependency, hostility, or so- 
cialization. Not all of the factors responsible for 
a child’s performance may be present currently. 
Retrospective accounts would have to be used, 
therefore, with all the difficulties of determining 
the actual correspondence between the verbal be- 
havior of the parent and their situations being 
described. 

The same kind of functional analysis can be 
made for the performance of the adult psychotic. 
Maintaining already-established behavior is more 
at issue than the initial development of a perform- 
ance in the case of the adult, however (3). 
One of the currently popular views of the etiology 
of severe personality disruption holds that the 
seeds of mental illness are to be found in the life 
experiences of the individual. In particular, the 
childhood is conceived as a critical period, and 
certain areas, such as parent-child relationships 
and psychosexual development, are viewed as cru- 
cial determiners of or prodromal to later adjust- 
ment or psychopathology. Three possible forms 
of “historical” or biographic etiology may be dis- 
tinguished: (a) the traumatic incident (e.g., wit- 
nessing in childhood the “primal scene”), (b) 
sequential traumata (i.e., a concatenation of emo- 
tional blows resulting finally in disintegration of 
the ego), and (c) acquired predisposition (i.e., 
the learning of a pattern of maladaptive response). 
Writing in 1893, Breuer and Freud (2, p. 6) 
noted that: 


The causal relation between the determining 
psychical trauma and the hysterical phenome- 
non is not of a kind implying that the trauma 
merely acts like an “agent provocateur” in re- 
leasing the symptom, which thereafter leads a 
separate existence. We must presume rather 
that the psychical trauma—or more precisely 
the memory of the trauma—acts like a foreign 
body which long after its entry must continue 
to be regarded as an agent that is still at work. 


Freud (2, p. 173), in describing the case of Eliza- 
beth von R., observes: 


Almost invariably when I have observed the 
determinants of such conditions what I have 
come upon has not been a single traumatic 
cause but a group of similar ones. . . . In some 
of these instances it could be established that 
the symptom in question had already appeared 
for a short time after the first trauma and had 
then passed off, till it was brought on again and 
stabilized by a succeeding trauma. 


Finally, we have a succinct expression of the learn- 
ing hypothesis in Shaffer and Shoben (7, p. 172): 


An evident conclusion is that our distinctive 
attributes, even our most fundamentally human 
qualities, are products of our experience with 
other human beings. Personality (underlining 
ours) is learned as a result of the events in one’s 
history. 

The significance of the once-occurring trauma 
in production of later symptomatology is given 
less prominence now than it held during the heyday 
of “la grande hysterie” and when phobic-compulsive 
symptoms were viewed as more circumscribed 
phenomena. Likewise, a simple additive 
notion about accumulations of emotional 
“shocks” is retained today chiefly by the laity, 
who express this notion in the “straw and camel's 
back” allegory. It is currently more common to 
view the patient’s history as either having de- 
prived him of an opportunity to learn normal 
patterns of socialization or as “overtraining” him 
in some repertoire of abnormal responses. If 
varying patterns of psychopathology are differen- 
tiated, and if the biography, or certain ages or 
elements in the biography, are believed to have 
pathogenic potential, it is reasonable to assume 
that differential patterns of personal history might 
be found which are demonstrably associated with 
various behavioral syndromes. Research in this 
realm immediately involves the investigator in dif- 
ficult methodological problems with respect to 
definition and recording of the life history. 

What is a life history? As the concept is widely 
and loosely used in much psychological writing, 
it is implicitly a theoretical abstraction. It may 
be conceived as inclusive of all events in the total 
sequence of time-space displacements of the in- 
dividual from the moment of his birth up to some 
time at which a summary is prepared. It may 
be conceived as the complete sequence of experi- 
ences had by the individual over such a period. 
It may be thought of as a “complete” collection 
of the events and experiences of the individual. 
Unless one wishes to assume perfect recording 
(and retention?) characteristics for the nervous 
apparatus, it is unlikely that the individual’s popu- 
lation of events, as observable time-space disposi- 
tions, has isomorphic representation in his popu- 
lation of experiences or subjective events. 

Without much formal attention to the theory 
of the life history, most of the research into bio- 
graphical factors in mental illness has been con- 
tent to use the framework or part of the structure 
of the so-called “psychiatric history.” This is 
essentially a relatively uniform selection of sig- 
nificant events and experiences in the histories of 
research subjects. What determines the selection 
and what defines significance? Selection is de- 
termined in part by what data are obtainable as 
a function of accuracy of records and integrity 
of memory. Significance is expressed through 
clinical judgment, largely as a result of observa- 
tions of apparent association between certain fac- 
tors and specific outcomes (e.g., broken homes 
and delinquency). Significance may be also de- 
termined on the grounds of currently (and locally) 
popular theory. Recognizing these constraints 
upon the adequacy of the typical psychiatric history 
as a representative sampling from the theoretical 
population of events and experiences constituting 
the true life history, it is incumbent upon 
the investigator into the contribution of life history 
factors to psychogenesis of emotional disturbance 
to seek at least minimal insurance 
against these sources of error. Such safeguards 
are afforded in the utilization of history outlines 
and recording forms and in the collection of data 
for control samples. The development and ap- 
plication of detailed history schedules helps to 
assure uniformity and thoroughness of coverage. 
The study of control groups serves as a check 
on the associations and hypotheses of etiological 
factors suggested by the study of exclusively 
pathological samples. 

The obviousness of these experimental caveats 
encourages an expectation that they have been 
well respected in researches into the life history 
as a source of psychopathology. A search of the 
literature reveals over 300 studies of life his- 
tories of psychiatric patients. Fewer than ten 
of these included data on a reasonably compar- 
able control group (1, 5, 6, 9). The bulk of the 
studies report simply the frequency of such items 
as patient’s age at death of a parent as recorded 
in mental hospital records. Frequently, only 
selected items are tabulated without any attempt 
at a comprehensive history. When more exten- 
sive coverage of history data has been undertaken, 
a variety of procedures for collection has been 
used and efforts to study a control sample have 
been rare. When studied, control groups have 
seemed mostly to be determined by accessibility 
rather than appropriateness (3). 


PURPOSE AND PREMISES 


The purpose of this study was (a) to deter- 
mine what the life histories of “normal” persons 
look like if examined through the spectroscope of 
a comprehensive psychiatric history interview as 
conducted and recorded by a skilled clinician and 
(b) to determine in what ways such histories may 
be distinguishable from those of such psychiatric 
patients as schizophrenics. 

The basic hypothesis of this research simply 
stated was: the life histories of “normal,” i.e., 
nonpsychiatric patients, will be readily and clearly 
differentiated from those of psychiatric patients 
if equally complete and carefully collected history 
data are available for both. More specifically, 
the hypothesis is that the histories of normals 
will reveal markedly less occurrence of those 
events or experiences (trauma, deprivations, frus- 
trations, conflicts, etc.) which have been com- 
monly considered to have a psychogenic or pro- 
dromal role in schizophrenia. This is a formal 
statement of the assumption which is made, usu- 
ally implicitly, in those clinical and uncontrolled 
studies which derive from the absolute frequency 
of given events in the histories of certain patients 
the conclusion that the event has causal import. 
In the absence of comparative data on the fre- 
quency of these same events in appropriate con- 
trol subjects, there is not only room but need to 
be doubtful of such conclusions. 


SAMPLE AND PROCEDURE 


The selection of the “normal” subjects (Ss) for 
this study was determined primarily by the na- 
ture of the 178 schizophrenics with whom com- 
parison was to be made. These patients were 
hospitalized at the University of Minnesota Hospitals, 
and comprehensive personal history statistics 
have been previously reported (8) for them.² 
To assure comparability of the normal and psy- 
chiatric samples, the former were drawn pri- 
marily from the same population which yielded 
the latter, i.e., the general group of persons re- 
ferred to the University hospitals for diagnosis 
and treatment. Of the total, 105, or 70% were 
University patients. An additional 14 cases were 
obtained from Minneapolis General Hospital. 
Finally, 31 physically and psychiatrically nega- 
tive cases were obtained from sources including 
hospital employees, students, employees of a 
large industrial firm, and office workers. The 
term “normal” as applied to this sample specifies 
the absence of psychiatric disturbances under 
treatment at the time the histories were recorded 
and the absence of any previous mental disorder. 
The nonpsychiatric patients were drawn from all 
but the psychiatric wards and psychiatric clinics 
of both the University hospitals and Minneapolis 
General Hospital. The medical diagnoses repre- 
sented covered a wide range, and no type of ill- 
ness or defect was predominant. 

To assure further comparability of the two 
groups, selection of the normals was made so as 


2 It is recommended that this report be read for 
further detail on the history schedule and method of 
data collection for the schizophrenic sample. 
to achieve matching with the schizophrenics for 
age, sex, and marital status. Success of the 
matching is indicated in Table 1. As a further 


TABLE 1 


GROSS DESCRIPTIVE DATA FOR NORMALS 
AND SCHIZOPHRENICS 


                     Normal           Schizophrenic 
                     (N = 150)        (N = 178) 
Age                  M = 28.6 yrs.    M = 28.7 yrs. 
                     SD = 9.7         SD = 9.6 
Sex 
  Male               44.7%            44.4% 
  Female             55.3             55.6 
Marital status 
  Married            33.3%            26.9% 
  Single             64.7             69.1 
  Divorced           1.3              1.7 
  Widowed            0.6              1.7 
  Separated          0.0              0.6 
Education            M = 12.0 yrs.    M = 9.9 yrs. 
                     SD = 3.4         SD = 2.5 
Home 
  Urban              44.7%            50.8% 
  Rural              55.3             49.2 
Religion 
  Lutheran           38.0%            35.7% 
  Other Prot.        31.7             31.0 
  Catholic           24.0             27.0 
  Jewish             0.6              6.2 
  Moslem             0.6              0.0 
No. of siblings      M = 4.1          M = 4.2 
                     SD = 2.8         SD = 3.4 
No. of children      (N = 53)         (N = 50) 
                     M = 2.4          M = 2.8 
                     SD = 2.1         SD = 2.4 


reflection of the comparability of the two samples, 
data on education, rural-urban origin, and re- 
ligious affiliation are recorded in Table 1. The 
mean number of years of formal education for 
the normals is reliably higher than that of the 
schizophrenics; however, the overlap between the 
two distributions is approximately 50%. Also, 
the higher average educational level of the nor- 
mals undoubtedly reflects the general increment 
in average educational level of the public, since 
they were hospitalized in 1956-1957 (of school 
age in 1904-1956), in contrast to the schizophrenics 
who were hospitalized between 1938 and 1944 (of school 
age in 1886-1944). No basic 
difference is indicated in the intellectual or gen- 
eral socioeconomic character of the two samples. 
The comparability of the two samples is further 
supported by the data on number of sibs and 
number of children in the married Ss as reported 
in Table 1. 
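
As a rough check on the statement that the educational means differ reliably, the summary figures printed in Table 1 can be turned into a test statistic. The sketch below is an illustration only. The choice of an unequal-variance (Welch) statistic and the helper name `welch_t` are assumptions, since the report does not say which test was applied to the means.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Unequal-variance t statistic computed from group summaries.

    Hypothetical helper for illustration; not the authors' procedure.
    """
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


# Years of education as printed in Table 1:
# normals M = 12.0, SD = 3.4, N = 150; schizophrenics M = 9.9, SD = 2.5, N = 178.
t_education = welch_t(12.0, 3.4, 150, 9.9, 2.5, 178)
print(round(t_education, 1))  # about 6.3
```

A statistic of this size with samples of 150 and 178 is consistent with a reliable difference in means. It says nothing about the overlap of the two distributions, which the text puts at approximately 50%.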

The life histories of the normals were collected 
through the medium of a comprehensive clinical 
interview requiring from 45 to 90 minutes. Only 
two of the patients who were approached refused 
to cooperate. Satisfactory rapport was achieved 
in most cases, and the Ss appeared to give re- 
liable accounts of their backgrounds. Many of 
them were interested in the nature of the study, 
and this was briefly discussed with them at the 
close of the interview. Also, at the end of the 
interview each S was administered an MMPI as 
a further check on his psychiatric status. 

Immediately after each interview, the inter- 
viewer transcribed her notes and further obser- 
vations onto a detailed schedule covering over 
100 distinct items pertaining to developmental, 
personal, social, and medical history. This was 
the same schedule which had been used in re- 
cording the history data for the schizophrenic 
sample.⁴ 


RESULTS 


The major findings of the study are reported 
in Tables 2-7, which show the percentage fre- 
quency of occurrence in the two samples of vari- 
ous psychological relationships and adjustment 
variables. These range from the quality of the 
relationship between the Ss' parents to the degree 
of manifestation of a life plan and initiative in 
the pursuit of that plan. The reliability of the 


3 All interviews were conducted by the junior au- 
thor, who at the time was a senior medical student. 

4 Copies of the 12-page rating schedule for case 
history analysis and of the 4-page record form used 
with it have been deposited with the American Docu- 
mentation Institute. Order Document No. 4209 from 
ADI, Auxiliary Publications Project, Photo-duplica- 
tion Service, Library of Congress, Washington 25, 
D. C., remitting in advance $1.25 for microfilm or 
$1.25 for 6 X 8 inch photocopies. Make checks pay- 
able to Chief, Photoduplication Service, Library of 
Congress. 

This schedule was originally developed by Dahl- 
strom (4) in connection with an unpublished doctoral 
dissertation. 
differences between the distributions of the two 
samples are reported in the tables.⁵ Generally 
more notable than the presence or absence of 
statistical reliability of the differences are the 
marked overlaps in the distributions for the 
schizophrenic and normal samples. 
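
The reliability figures in the tables rest on chi-square comparisons of the distributions of the two samples. A minimal sketch of that computation follows, using the heterosexual adjustment ratings of Table 4. The cell counts are an assumption: they are reconstructed by applying the printed percentages to the printed Ns. The function name `chi_square` and its `yates` flag are illustrative and are not taken from the report.

```python
import math


def chi_square(table, yates=False):
    """Pearson chi-square for a contingency table given as rows of counts.

    With yates=True, 0.5 is subtracted from each absolute deviation
    (the continuity correction conventionally used for 2 x 2 tables).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            deviation = abs(observed - expected)
            if yates:
                deviation = max(deviation - 0.5, 0.0)
            stat += deviation ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Poor / fair / good heterosexual adjustment.
# Counts reconstructed from Table 4 percentages (assumption):
# normals 36.7%, 20.7%, 42.7% of 150; schizophrenics 22.2%, 33.3%, 44.4% of 171.
normals = [55, 31, 64]
schizophrenics = [38, 57, 76]

stat, df = chi_square([normals, schizophrenics])
# For 2 degrees of freedom the upper-tail probability is exactly exp(-x / 2).
p = math.exp(-stat / 2)
print(round(stat, 2), df, p < 0.01)  # 10.49 2 True
```

With these reconstructed counts the statistic agrees with the value of 10.49 printed in Table 4, and the probability falls below .01 as reported.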

Table 2 reveals that the relationship between 
the parents of both the normal Ss and schizo- 
phrenic patients was predominantly one of affec- 
tion. Relationships characterized as ambivalent, 
indifferent, or hostile were slightly more frequent 
between the parents of the normals. In their 
relationships with their fathers, the schizophrenics 
experienced a reliably higher frequency of un- 
favorable attitudes. However, two-thirds of these 
patients apparently received affection from their 
fathers, and less than one-fourth were the re- 
cipients of either rejection or domination. The 
maternal relationship more clearly differentiates 
the two samples. Again, however, it is to be 
noted that nearly two-thirds of the schizophrenics 
enjoyed the affection of their mothers. Of the 
various undesirable relationships which were recorded, 
overprotection and domination were most 
prominent in the schizophrenics. This is slight 
support for the current belief of some experts in 
the existence of the so-called “schizophrenogenic 
mother”: 


Psychoanalytic students are uniform in the 
opinion that maternal rejection and domination 
are regularly found in the histories of those 
who are found later to have been predisposed 
to the development of schizophrenia. The 
mother of the schizophrenic is variously de- 
scribed as cold, dominating, narcissistic, lack- 
ing love for the child, having death wishes to- 
ward it... . (ZI) 


However, domination or overprotection charac- 
terized the relationship of less than one-fourth of 
the mothers of the schizophrenics in the present 
study. 

Table 2 also reports the relationships between 
the two samples and their respective sibs and the 
frequency of six factors of physical and/or psy- 
chological deprivation or trauma. The intersib 
relationships do not differentiate the two groups. 
While none of the home conditions noted oc- 


5 The chi square test was used to determine proba- 
bility that the obtained distributions belong to a com- 
mon population. Where indicated, Yates’ correction 
was applied. 
TABLE 2 


INTERPERSONAL RELATIONSHIPS, HOME FACTORS, 
AND SCHOOL ADJUSTMENT IN THE EARLY 
HISTORIES OF NORMALS AND SCHIZOPHRENICS 


                                Normal      Schizophrenic   χ²            p 
Interparental relationship      (N = 144)   (N = 101) 
  Affection                     75.7%       76.2%           0.14          >0.98 
  Ambivalence                   6.3         5.9 
  Indifference                  0.7         1.9 
  Hostility                     17.4        15.8 
Paternal relationship           (N = 143)   (N = 127) 
  Affection                     76.2%       65.3%           13.02         <.05 
  Ambivalence                   9.8         5.5 
  Indifference                  4.2         3.1 
  Rejection                     7.7         11.8            3.91          <.05 
  Overprotection                0.7         1.5 
  Domination                    1.4         11.8 
Maternal relationship           (N = 150)   (N = 128) 
  Affection                     81.3%       64.8%           32.90         <.01 
  Ambivalence                   8.0         2.3 
  Indifference                  2.7         1.5 
  Rejection                     6.0         6.2 
  Overprotection                0.7         13.2            [illegible]   [illegible] 
  Domination                    0.7         10.9 
  Neglect                       0.7         0.7 
Sibling relationship            (N = 178)   (N = 126) 
  None                          3.4%        3.9%            4.43          <.50 
  Affection                     70.8        76.1 
  Indifference                  2.2         4.7 
  Rivalry                       16.3        13.4 
  Domination                    2.8         0.7 
  Submission                    4.5         0.7 
Home conditions 
  Poverty                       20.7%       9.0%            8.00          <.01 
  Alcoholism                    6.0         6.8             0.19          <.70 
  Invalid                       12.7        0.6             18.67         <.01 
  Divorce                       6.0         1.6             3.14          <.10 
  Separation                    2.7         1.1             0.38          <.70 
  Death of parent               14.7        10.7            0.82          <.50 
School acceptance               (N = 155)   (N = 159) 
  Marked hostility              7.1%        3.8%            13.64         <.01 
  Mild dislike                  17.4        11.3 
  Indifference                  11.0        25.8 
  Agreeable                     54.8        54.7 
  Keen enjoyment                10.0        4.4 
School achievement              (N = 150)   (N = 163) 
  Repeated failure              7.3%        4.9%            21.49         <.01 
  Work difficult                6.0         19.6 
  Average performance           42.0        50.2 
  Easily earned good grades     38.7        19.6 
  Accelerated                   6.0         4.4 
School deportment 
  Poor record                   9.3%        7.9%            22.92         <.01 
  Usual no. of escapades        36.0        10.6 
  Excellent record              54.7        81.6 


Note: In this and subsequent tables, where two χ² values appear 
for a given factor, one is based on all categories and the second is for 
comparison of bracketed versus unbracketed categories. 
curred in more than a fifth of the homes of either 
sample, two of the factors, namely, poverty and 
invalidism, did have a reliably different rate in 
the two groups, and the rate of divorce ap- 
proached a reliable difference. All three of these 
more frequently characterized the childhood 
homes of the normals. 

The attitudes, adjustment, and achievement of 
the two groups in their school experiences are 
recorded in Table 2. All three factors show 
reliably different distributions for the two sam- 
ples. General attitude toward the school situa- 
tion was less different for the normals and schizo- 
phrenics than were achievement and deportment. 
Three times as many schizophrenics as normals 
found their school work difficult, and twice as 
many normals as schizophrenics easily earned 
good grades. The passivity of the prepsychotic 
schizophrenic in the school room, which has been 
commonly observed by clinicians, is suggested in 
their clear preponderance of excellent deport- 
ment records. While nearly one-fourth of the 
schizophrenics experienced failure or difficulty 
with their school work, less than one-tenth of 
them had poor deportment. By contrast, while 
better than 85% of the normals had satisfactory 
or superior achievement, only half of them had 
excellent deportment records. 

Occupational success and satisfaction distinguish 
the normals from the schizophrenics as 
revealed in Table 3. The proportion of the two 
groups without any history of occupation is not 
reliably different. While poor occupational 
achievement was recorded for none of the nor- 
mals with an occupational history, this rating was 
assigned to one-fifth of the schizophrenics. Over- 
lap is again notable, with 85% of the schizo- 
phrenics manifesting average or good occupa- 
tional success. Satisfaction with their occupa- 
tions also differentiated the two groups. Over 
half of the schizophrenics disliked or were in- 
different to their work, while three-fourths of 
the normals apparently enjoyed their occupations. 

Table 3 indicates that the two groups were not 
differentiated with regard to frequency of church 
attendance, although the role of religion or atti- 
tude toward it was different in the two. While 
there was no difference in the frequency with 
which religion afforded a dominant source of 
balance to the lives of the two groups, an intel- 
lectualized or ritualistic approach to religion was 
found four times as frequently among the nor- 
mals as among the schizophrenics. Table 3 also 
reports the occurrence of delinquency and crimi- 
nal records; the rates are very small and not dif- 
ferent for the two samples. 

Table 4 records the dating history and marital 
adjustment of the two samples. Frequency of 
dating is not reliably different in the two groups 
TABLE 3 


OCCUPATIONAL HISTORIES AND RELIGIOUS 
ORIENTATION OF NORMALS AND 
SCHIZOPHRENICS 


                                Normal      Schizophrenic       χ²            p 
Occupation                      (N = 150)   (N = 177) 
  None                          15.3%       21.1%               0.72          <.50 
Occupational success            (N = 127)   (N = [illegible]) 
  Poor                          0.0%        14.9% 
  Average                       38.6        31.5                [illegible] 
  Good                          61.4        53.5                18.66         <.01 
Occupational satisfaction       (N = 127)   (N = 126) 
  Dislike                       7.9%        9.5%                21.74         <.01 
  Indifference                  17.3        43.6 
  Enjoyment                     74.8        46.8 
Church attendance               (N = 150)   (N = 147) 
  Very infrequent               22.0%       33.3%               4.77          <.10 
  Occasionally                  36.7        31.2 
  Steady                        41.3        35.3 
Religiousness                   (N = 142)   (N = 121) 
  Intellectualized, ritualistic 18.3%       4.1%                12.31         <.01 
  Occasional solace             42.3        55.4 
  Dominant source of balance    39.4        40.5 


TABLE 4 


DATING HISTORY, MARITAL AND HETEROSEXUAL 
ADJUSTMENTS OF NORMALS AND SCHIZOPHRENICS 


                                Normal      Schizophrenic   χ²            p 
Dating                          (N = 150)   (N = 145) 
  None or little                49.3%       57.2%           5.98          <.10 
  Average                       46.0        42.8 
  Very popular                  4.7         0.0             [illegible]   <.20 
Marital adjustment              (N = 53)    (N = 50) 
  Extreme frustration           18.9%       16.0%           6.73          <.20 
  Continual conflict            15.1        14.0 
  Compatibility                 17.0        42.0 
  Pleasure                      41.5        24.0 
  Chief pleasure                7.5         4.0 
Affection to mate               (N = 53)    (N = 51) 
                                62.3%       64.7% 
Heterosexual adjustment         (N = 150)   (N = 171) 
  Poor                          36.7%       22.2%           10.49         <.01 
  Fair                          20.7        33.3 
  Good                          42.7        44.4 
Adequacy of outlet              (N = 150)   (N = 170) 
  Poor                          38.6%       24.7%           11.18         <.01 
  Fair                          33.3        30.5 
  Good                          28.0        44.7 


if the “average” and “very popular” categories 
are combined for contrast with the “none or little” 
category. Although there were no schizophrenics 
rated as “very popular” in terms of frequency of 
dating, less than 5% of the normals fell into this 
category. Approximately a third of both samples 
were married (Table 1). The marital adjustment 
of the two groups was not different, and 
a third of both groups experienced frustration or 
conflict in their marriages. Although the nor- 
mals tended to derive distinct pleasure from their 
marriages with somewhat higher frequency than 
the schizophrenics, about two-thirds of both groups evinced 
attitudes of affection toward their spouses. 

The quality of sexual adjustment and adequacy 
of outlet are shown in Table 4. Sexual adjust- 
ment is reliably different in the two samples. 
Surprisingly, the difference appears primarily in 
the greater frequency of poor sexual adjustment 
in the normal Ss. Likewise, the schizophrenics 
were rated twice as frequently as the normals as 
enjoying “good sexual outlets.” 

The quality of social adjustment and of factors 
affecting interpersonal relationships are shown in 
Table 5. The adequacy of social adjustment is 


TABLE 5 


SOCIAL SKILL AND ADJUSTMENT OF SCHIZOPHRENICS 
AND NORMALS 


                                Normal      Schizophrenic   χ²       p 
Social adjustment               (N = 150)   (N = 85) 
  Withdrawal                    10.0%       61.2%           78.60    <.01 
  Ambivalence                   7.3         14.1 
  Membership                    82.7        24.7 
Social intelligence             (N = 150)   (N = 167) 
  Inept and clumsy              8.1%        8.9%            29.66    <.01 
  Moderate skill                68.7        88.8 
  Adept                         22.7        2.3 
Poise                           (N = 150)   (N = 170) 
  Retiring and sensitive        22.7%       37.4%           12.99    <.01 
  Fairly articulate             51.3        49.7 
  Confident                     26.0        12.8 
Recreation                      (N = 150)   (N = 163) 
  Solitary                      27.3%       40.5%           6.51     <.05 
  Mixed                         66.0        52.2 
  Social                        6.7         7.3 


clearly different for the two samples, with the 
schizophrenics showing a high frequency of with- 
drawal and a low rate of active group member- 
ship. Likewise, both the variables of social in- 
telligence and poise show reliably inferior dis- 
tributions for the schizophrenics. The overlaps 
between the two groups are considerable, how- 
ever. Nearly 90% of the schizophrenics were 
rated as having at least moderate skill in inter- 
personal relations, and only a third of them were 
characterized as retiring and sensitive. Pursuit 
of solitary recreation was more characteristic of 
the schizophrenics than the normals, but over
half of the former had a history of mixed or
social recreation. 
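
The chi-square values in Table 5 can be approximated from the printed percentages. The following is a minimal sketch, assuming that cell counts may be recovered by rounding percentage times N (an assumption, since the original frequencies are not reported); the helper names `chi_square` and `counts_from_percent` are illustrative. Because of this rounding, the statistic for social adjustment comes out near, but not identical to, the tabled value of 78.60.

```python
def chi_square(table):
    """Pearson chi-square and degrees of freedom for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def counts_from_percent(percents, n):
    """Assumed reconstruction of cell counts from rounded percentages."""
    return [round(p * n / 100) for p in percents]


# Social adjustment: withdrawal, ambivalence, membership (Table 5).
normal = counts_from_percent([10.0, 7.3, 82.7], 150)
schizophrenic = counts_from_percent([61.2, 14.1, 24.7], 85)
stat, df = chi_square([normal, schizophrenic])
print(f"chi-square = {stat:.2f} on {df} df")
```

With 2 degrees of freedom, any value of this size lies far beyond the .01 point, which agrees with the tabled significance level.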

In Table 6 are reported five areas of early per- 
sonal attitude and expression which might be 


TABLE 6


INTERESTS, ASPIRATIONS, AND INITIATIVE
OF NORMALS AND SCHIZOPHRENICS


                              Normal      Schizophrenic     χ²        p

Breadth of interest          (N = 150)    (N = 173)
  Narrow                      20.7%        35.2%          44.30     <.01
  Some outside                46.0         60.2
  Broad                       33.3          4.6
Level of aspiration          (N = 150)    (N = 170)
  Limited                     16.0%        34.7%          17.79     <.01
  Interested in improving     61.3         55.3
  High                        22.7         10.0
Life plan                    (N = 150)    (N = 164)
  Vague                       22.1%        46.4%          56.63     <.01
  Confused                     9.3         28.0
  Clear                       68.0         25.6
Stability                    (N = 150)    (N = 172)
  Constant fluctuation        10.7%        13.9%          47.17     <.01
  Moderate variability        78.7         42.7
  Stolid                      10.7         43.4
Initiative                   (N = 150)    (N = 169)
  Apathetic                    8.0%        16.0%           4.12     <.20
  Appropriate                 57.3         53.2
  Energetic                   34.7         30.8


broadly classed as manifestations of individual 
perspective and morale. Four of these variables 
show reliably different distributions in the two 
groups. Summarizing for these variables, it may 
be said that the schizophrenics less frequently had 
broad interests, a high level of aspiration, and a 
clear life plan; more frequently than the normals 
they were characterized by a stolid, nonvarying 
temperament and by absence of initiative. 


DISCUSSION 


The single most impressive feature of the data 
presented in Tables 2-6 is the sizable overlap of 
the normal Ss and schizophrenic patients in the 
distributions of the various personal history vari- 
ables. Of the 35 separate tests which were run, 
13 (or 37%) failed to reveal a reliable difference 
between the two samples. Further, on 5 of the 
remaining 22 variables, the distributions showed 
a reliably greater presence in the normals of 
negative or undesirable conditions. In those in-
stances where the statistical tests did indicate a
reliable characterization of the schizophrenics by 
prevalence of a pathogenic variable, the normals 
generally also showed a closely approximating 


degree of the same factor. Before discussing the 
implications of the findings, it will be well to 
review facets of the data collection and recording 
which might have had biasing effects. 

The history data for the schizophrenics were 
abstracted from the material routinely collected 
and recorded clinically in the hospital charts of 
these patients as they were admitted, diagnosed, 
treated, and discharged. A variety of persons,
including social workers, junior medical students, 
psychiatric residents, and staff psychiatrists, con- 
tributed to the recording of these data. No single 
person was charged with the collection of com- 
prehensive history nor was a detailed research 
schedule applied as a reference in collecting the
material. These facts suggest the possible under- 
estimation of the actual frequency of certain 
variables in the histories of the schizophrenics. 
However, in abstracting the clinical material and 
recording it on the research schedule, no attempt 
was made to force the rating of a variable when 
clear information was not available; free use 
was made of an “unknown” category. This tac- 
tic has the effect of enhancing the reliability of 
the frequencies reported at the expense of having 
a varying size of sample (for example, “paternal 
relationships” was rated for only 127 of the 178 
schizophrenics). 

Reliability of the ratings of the various factors 
is an important consideration. This is especially 
true with respect to the data on the schizophrenics 
for whom the original clinical records were not 
uniform and from which, without any other 
source of information or contact with the patient, 
the rater had to abstract the material pertinent to 
a given variable and then assign it a rating. The 
definitions of the various scales were made as 
objective and nonambiguous as possible, and the 
number of steps to each scale was generally small. 
As reported previously, independent abstractors- 
and-raters achieved an 80-95% agreement over 
a small sample of trial cases (8). 

The reliability of the data for the normals was 
enhanced by providing for an immediate record- 
ing and rating of information which had been 
collected in an extended interview conducted with 
the research schedule as an implicit guide to in-
sure coverage. Failure to find distinguishing
features for the two groups might be the result 
of a “contamination effect” if the same person or 
persons had been responsible for the study and 
rating of both the schizophrenics and normals.
Actually, two researchers working quite inde-
pendently and at different periods of time col-
lected and rated the schizophrenic and normal 
material respectively.⁶ This avoids an artificial
overlap in the distributions as a function of a 
common rater projecting implicit “base rate” 
standards from one sample to another. There 
remains the question of interrater reliability and 
the possibility of stable but different interpreta- 
tions of the criteria for various ratings leading 
to Type I errors (10). This possibility was par- 
ticularly suggested, for example, by the surprising 
distributions of the “heterosexual” adjustment 
variable (see Table 4). As a check on the pos- 
sibility that relatively different criteria had been 
applied by the two raters (both single females) 
in assessing this variable, they were asked to give 
independent, free accounts of their respective in- 
terpretations of the steps on this scale and of the 
criteria they utilized in assigning the cases. They 
appeared to be in essential agreement in these 
respects. In short, within the limits of the in- 
equalities imposed by the differences in the nature 
of the raw data for the two samples and the 
methods by which the basic history material was 
obtained, it would seem that the ratings of the 
various history factors were reasonably reliable 
and no obviously biasing and unbalanced factor 
operated which would serve to either exaggerate 
or diminish true differences between the two 
groups. 

Consideration must also be given to the possi- 
bility that such differences as were obtained be- 
tween the two samples might be a function of 
the different periods of time from which they 
were sampled. As pointed out in discussing the 
difference in mean educational level of the two 
groups, the schizophrenic patients were hospital- 
ized between 1938-1944. Approximately 80% 
of these patients were between the ages of 20-50 
when hospitalized. Using these ages as a refer- 
ence point for the sample and defining childhood 
and early adolescence as encompassing the first 
15 years of life, this period occurred for the bulk 
of the schizophrenics between 1890-1940. By 
contrast, the normals (with the same age distri-
butions as the schizophrenics) were evaluated in
1956-1957. For these Ss, the period of child- 


⁶ The schizophrenic records were rated by Miss Bell
in 1951-52, and the normal Ss were recorded by Lucy
Balian in 1956-57.

hood and early adolescence would fall predomi-
nantly in the years 1910-1950. It would be
difficult to ascertain the degree to which these 
two periods would be characterized by distinctive 
patterns of parental attitude, child-rearing prac- 
tices, major social upheaval, and other potential 
sources of psychological effects in the early life
histories of individuals. Such contrast could be 
best drawn for the earliest period represented in 
the schizophrenics (1890-1910) and the most 
recent period sampled in the normals (1940- 
1950), but these periods would account for the 
“formative years” of only a small portion of each 
of the groups. The overlap in childhood years 
for the two samples is great, and such significant 
factors as the impact of Freudian psychology, 
the first World War, and the Great Depression 
occurred in the time interval common to both. 
It seems unlikely that broad differences in the 
sociopsychological cultures from which the schiz- 
ophrenics and normals were drawn can be used 
to account for their respective distributions on 
the personal history variables analyzed in this 
study. In any event, if significant social and cul- 
tural factors did indeed differentiate the periods 
1890-1910 and 1940-1950, and such factors had 
causal potency for personality development, they 
should contribute to more and larger differences 
between the schizophrenics and normals. The 
restricted number of differences obtained and the 
impressive amount of overlap between the two 
groups suggest limited existence and/or potency
of differential sociological factors. 
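
The childhood periods quoted above follow from simple arithmetic on the evaluation years and the age span. The sketch below assumes ages of 20 to 50 for every member of both samples (the text gives this span for about 80% of the schizophrenics) and uses a hypothetical helper, `childhood_span`. The text rounds the resulting ranges to 1890-1940 and 1910-1950.

```python
def childhood_span(first_year, last_year, min_age=20, max_age=50, childhood_years=15):
    """Return (earliest, latest) calendar year covered by the first 15 years of life."""
    earliest_birth = first_year - max_age   # oldest S, evaluated in the first year
    latest_birth = last_year - min_age      # youngest S, evaluated in the last year
    return earliest_birth, latest_birth + childhood_years


schizophrenics = childhood_span(1938, 1944)
normals = childhood_span(1956, 1957)

# Years that fall within the childhood period of both samples.
overlap = (max(schizophrenics[0], normals[0]), min(schizophrenics[1], normals[1]))
print(schizophrenics, normals, overlap)
```

Under these assumptions the two childhood periods share more than three decades, which is the basis for the statement that the overlap in childhood years is great.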

Finally, the lack of more extensive and clear- 
cut differences between the backgrounds of the 
two samples might be attributed to the fact that 
they did not actually represent distinct popula- 
tions of the psychiatrically ill and the psychiatri- 
cally negative. No quarrel can be made with the 
diagnoses of the schizophrenics. They clearly 
suffered psychotic disturbance of sufficient mag- 
nitude to necessitate hospitalization. Further- 
more, they received their specific diagnoses at a 
time when schizophrenia was not being used as 
a synonym for all psychosis without obvious brain 
pathology. 

Some clinicians undoubtedly would take ex- 
ception to the “normality” of our control group. 
It should be iterated that they denied any history 
of mental illness or psychiatric consultation, and 
they manifested no evidence of gross emotional 
disturbance at the time they were interviewed in
spite of the fact that they were currently under
study or treatment for serious physical illnesses. 
The MMPIs administered as an objective check 
on psychiatric status supported the clinical im- 
pression of essential normality. As shown in 
Table 7, the mean scores on this psychiatric 
screening test were well within normal limits. A 
further analysis of the quality of the normal group 
was made by identifying those Ss whose histories 
included one or more of the events or experiences 
which are generally regarded as psychic trauma. 
The MMPIs of this group of 37 persons (nearly 
one-fourth of the normal sample) were compared 
with those of the remaining “nontraumatized” 
sample. Mean profiles of the two subgroups are
reported in Table 7. Only the K and D scales
distinguish these cases. The “traumatized” group
had a lower mean K score (t = 4.35; p < .01)
and a higher mean D score (t = 3.75; p < .01).
These observations suggest a less defensive and 
somewhat more depressive orientation in the Ss 
with the traumatic histories. The lack of more 
extensive differentiation of these two subgroups 
throws a further doubt on the hypothesis that early 
trauma per se are significant predisposing factors 
in later mental illness. 

The data of this study seem to cast serious
doubt on the etiological significance of certain
early life factors for which such import has been
frequently claimed. These factors may in fact 
play a causal role in the development of personal- 
ity disturbance, but not as solitary pathogenic 
elements. It would appear that it is the pattern- 
ing or chaining of experiences rather than occur- 
rence or absence which must be examined. While 
the notion of multiple causation is well estab- 
lished, it is more frequently stated in the context 
of types of etiological agent—the physical and 
the psychological—rather than in terms of multi-
plicity within a given area—personal relationships.

The surprising frequency with which certain 
forms of pathogenic experiences or circumstances 
were found in the life histories of the normal Ss 
suggests the need to think in terms of “suppressor” 
experiences or control variables in the develop- 
ment of personality. Woolley (11) found cold,
rejecting, or dominating and exploiting mothers 
to be “regularly” present in 100 cases selected 
only for the adequacy of their histories, but stipu- 
lated: 


These factors constitute the background for the 
children who escape as well as for their schizo- 
phrenic siblings. Moreover, there are families 
in which no schizophrenic denouements occur. 


TABLE 7

MMPI SCORES OBTAINED BY NORMAL CONTROLS

              Mean T Scores

Scale      Female      Male       Traumatized       Nontraumatized       t
          (N = 75)   (N = 57)      (N = 37)           (N = 95)

?            50         50
L            53         50        4.27 ± 1.98        4.01 ± 2.21        0.6
F            50         53        3.78 ± 2.98        2.88 ± 2.13        1.66
K            59         59       14.43 ± 5.47       17.56 ± 4.00        4.35*
Hs           56         57       57.73 ± 9.41       55.88 ± 10.30       1.0
D            57         5,       62.27 ± 11.11      54.38 ± 9.74        3.75*
Hy           59         60       59.95 ± 8.21       59.26 ± 8.43        0.43
Pd           55         60       57.46 ± 11.53      57.69 ± 9.08        0.11
Mf           51         55       51.76 ± 10.21      52.46 ± 8.55        0.37
Pa           56         53       54.43 ± 7.87       53.77 ± 7.93        0.44
Pt           55         56       56.62 ± 8.86       55.48 ± 8.61        0.67
Sc           55         57       56.81 ± 9.64       55.83 ± 6.86        0.56
Ma           53         55       53.68 ± 9.64       53.94 ± 10.41       0.13

* Exceeds the .01 level.


Evidently, there must be factors concerning the
degree of rejection, its time of occurrence, its 
differential distribution among the siblings or 
the occurrence of reinforcing or ameliorating 
experiences. 

May it not be that the development of serious 
mental disorder will be less well understood if 
we concentrate solely on examination of patho- 
logical processes and injurious agents, rather than 
examining for the nature and extent of “immu- 
nizing” experiences? It seems necessary that we 
turn some of our research energies toward a dis- 
covery of those circumstances or experiences of 
life which either contribute directly to mental 
health and emotional stability or which serve to 
delimit or erase the effects of pathogenic events. 
For this purpose, we will need to make extensive 
psychological study of the biographies of normal 
persons as well as of patients, with such biog- 
raphies recorded so that their coverage and uni- 
formity facilitate analysis. 


SUMMARY AND CONCLUSIONS 


Through the medium of extended clinical in- 
terviews the life histories of 150 psychiatrically 
normal subjects were collected and subsequently 
recorded in detail on a research schedule which 
had been used previously in a study of the his- 
tories of 178 hospitalized schizophrenics. Of the 
150 normals, 119 were hospital or clinic patients 
being studied and treated for a wide range of
serious physical illnesses. Selection of the nor- 
mals was made so that they were drawn from the 
same general population as the psychiatric pa- 
tients, and they were matched with the schizo- 
phrenics for age, sex, and marital status. 

Separate statistical analyses were made of the
reliability of the differences between the distribu-
tions of the normals and schizophrenics on 35
major aspects of early history and adjustment.
Of these 35 variables, 13 (or 37%) failed to
reveal a reliable difference between the two sam-
ples. On 5 of the 22 variables which yielded re-
liable differences, the normals were characterized 
by greater frequency of the undesirable or patho- 
genic factor. Specifically, the normals had a 
greater frequency of poverty and invalidism in 
their childhood homes, poorer heterosexual ad-
justment and adequacy of sexual outlet, and a
greater incidence of an intellectualized, ritualized
orientation toward religion. Additionally, the
greater frequency of divorce in the childhood 
homes of the normals approached reliability.
The schizophrenics were characterized by re-
liably higher incidence of unfavorable relation-
ships with mothers and fathers, poorer attitudes
toward and achievement in school, less occupa- 
tional success and satisfaction, higher rates of 
social withdrawal, lack of social adeptness and 
poise, narrow interests, limited aspiration, vague 
life plans, and lack of initiative. These personal 
history characteristics which are predominant in 
the schizophrenics are in line with the general de- 
scription which has been made of the preschizo- 
phrenic personality and lend some support to the 
central concept of withdrawal. However, the 
extent to which these same characteristics were 
found in closely approximate proportions in the 
histories of the normals suggests the need for great 
reservation in interpreting the isolated schizo- 
phrenogenic potency of such factors as the 
mother-child relationship. 

The notion that any single circumstance, depri- 
vation, or trauma contributes uniformly and in- 
evitably to the etiology of schizophrenia is called 
into serious question. The necessity of studying 
the incidence of such factors in appropriate sam- 
ples is exemplified. It is suggested that the pat- 
terning of life experiences may be more crucial 
than occurrence or absence of specific psychic 
stresses. The finding of “traumatic” histories in 
nearly a fourth of the normal subjects suggests the 
operation of “suppressor” experiences or psycho-
logical processes of immunization. It is suggested
that improved insights into mental illness may be 
afforded by careful, intensive studies of the life 
histories of normals.


MOTIVATION AND LANGUAGE
BEHAVIOR: A CONTENT 
ANALYSIS OF SUICIDE NOTES * 


CHARLES E. OSGOOD
AND EVELYN G. WALKER ¹


Whenever a person produces a message, whether 
it be conversation, an ordinary letter to a relative, 
or a suicide note, he employs a complex set of 


* Reprinted by permission from The Journal of Ab- 
normal and Social Psychology, July, 1959, Vol. 59, 
No. 1, 58-67. 

1 The authors wish to express their thanks to Joseph 
B. Casagrande of the Social Science Research Council 
for locating the suicide notes used in this study and 
to Edwin S. Shneidman of Los Angeles for lending 
them to us. The notes we used will appear in The 
Cry for Help (McGraw-Hill, 1958), to be edited by 
Edwin S. Shneidman.

encoding habits. It seems reasonable to assume
that these language habits are organized in much 
the same way as the habits underlying nonlan- 
guage behavior and that the general principles of 
learning and performance therefore apply equiva- 
lently in both cases. This paper is concerned with 
the effects of motivation upon language behavior. 
It is assumed that the author of a suicide note— 
presumably written shortly before he takes his 
own life—is functioning under heightened motiva- 
tion. Therefore, the structure and content of 
suicide notes should differ from both ordinary 
letters and from simulated suicide notes in certain 
ways predictable from a general theory of be- 
havior. Following a brief theoretical discussion, 
we describe the application of a number of rele- 
vant content measures to a comparison, first, of 
suicide notes with ordinary letters to relatives and, 
second, of suicide notes with faked notes. Many 
of these measures differentiate in predicted ways 
suicide notes from normal control notes; a smaller 
number differentiate suicide from simulated sui- 
cide notes, suggesting that nonsuicidal individuals 
are able to adopt the state of the suicidal person 
in some respects but not in others. 

Language habits, like habits in general, appear 
to be organized into hierarchies of alternatives. 
We shall assume that increased drive has two 
distinct effects upon selection within such hier- 
archies: generalized energizing effects and specific 
cue effects (cf., 6 for a more complete analysis). 

The generalized energizing effects of drives are 
characterized by a nonspecific facilitation of all 
habits. Following the views expressed by Hebb 
(3), one may identify the generalized energizing 
effects of drives with arousal of a neural system 
in the brainstem from which there is diffuse, non- 
specific projection into the cortex, these impulses 
having a summative, “tuning-up” function. As- 
suming a multiplicative relation between habit 
strength and drive in producing reaction potential 
(cf., 10), the effect of increasing drive should be 
to make the dominant alternatives within all hier- 
archies even more probable relatively. Our first 
prediction, therefore: (A) Suicide notes will be 
characterized by greater stereotypy than messages 
produced under lower degrees of motivation. Sui-
cide notes should therefore be more repetitious,
less diversified in lexical content, use fewer ad- 
jectival and adverbial qualifiers, more familiar 
words and phrases, and so on. However, since 
the maximum strengths of habits are assumed to
be asymptotic, extreme increase in drive should
force many competing habits toward a common 
maximum and hence produce interference and 
blocking. Therefore: (B) If extremely high levels 
of drive can be assumed, suicide notes should dis- 
play greater disorganization of language behavior. 
This would include various kinds of errors, break- 
ing up of messages into shorter units, and similar 
phenomena. 
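
Predictions (A) and (B) can be illustrated with a toy calculation. The sketch below is not the authors' model: it assumes a reaction potential equal to habit strength times drive, a hard ceiling standing in for the asymptote, and a logistic choice rule on the gap between two competing alternatives. The ceiling, the choice rule, the `spread` parameter, and all numerical values are assumptions made only for illustration.

```python
import math


def reaction_potential(habit, drive, ceiling=1.0):
    """Multiplicative rule (habit x drive), capped at an assumed ceiling."""
    return min(habit * drive, ceiling)


def p_dominant(h_strong, h_weak, drive, ceiling=1.0, spread=0.1):
    """Probability that the stronger habit is selected (assumed logistic rule)."""
    gap = (reaction_potential(h_strong, drive, ceiling)
           - reaction_potential(h_weak, drive, ceiling))
    return 1.0 / (1.0 + math.exp(-gap / spread))


low = p_dominant(0.4, 0.2, drive=1.0)      # moderate drive
high = p_dominant(0.4, 0.2, drive=2.0)     # heightened drive: the gap doubles
extreme = p_dominant(0.4, 0.2, drive=5.0)  # both alternatives reach the ceiling
print(round(low, 3), round(high, 3), round(extreme, 3))
```

Under these assumptions, raising drive first makes the dominant alternative more probable (stereotypy) and then, once both alternatives reach the common maximum, reduces selection to chance (interference).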

To the extent that drive states are accompanied 
by distinctive sensations—e.g., thirst sensations, 
feelings of anxiety, sensations of pain—these dis- 
tinctive cues can become associated with certain 
alternatives within habit hierarchies through the 
operation of ordinary learning principles. The 
presence of such cues, as directive states, will have 
the effect of modifying the probability structure of 
behavioral hierarchies, increasing the probability 
of some alternatives, decreasing the probability of 
others. This leads to the following prediction: 
(C) Suicide notes should be characterized by in- 
creased frequency of those grammatical and lexi- 
cal choices associated with the motives leading to 
self-destruction. On a rather mundane level, this 
means that suicide notes should contain a rela- 
tively high frequency of self- and other-critical 
statements. Less obviously, they should contain 
a high frequency of what Skinner (9) calls 
“mands”—constructions of the demand, com- 
mand, request type that express needs of the 
speaker and require some behavior on the part 
of the listener for their satisfaction. Finally, if 
two or more motives are operating, and their cues 
are associated with selection of different alterna- 
tives within hierarchies, one may expect oscilla- 
tion between the responses associated with each 
state. Since it seems reasonable to assume that
suicidal people will often be functioning under
competing motives, e.g., self-criticism vs. self- 
protection, spouse-aggression vs. spouse-affection, 
etc., we may predict that: (D) Suicide notes 
should be characterized by more evidence of con- 
flict than messages produced under non-suicidal 
states. Among indices of conflict would be use
of constructions with but, however, if, and the 
like, qualification of verb phrases, and ambiva- 
lence in the assertions made about significant 
persons.


METHOD 


The suicide materials for this study consisted 
of two samples. The first was a set of 100 gen-
uine suicide notes, 50 written by men and 50
written by women just prior to taking their own 
lives. These were obtained from Edwin S. 
Shneidman from his Los Angeles files. For com- 
parison purposes, we obtained a sample of ordi- 
nary letters written to 100 members of a panel of 
Ss in the Champaign-Urbana area; this panel had 
been used for other purposes in connection with 
research on the communication of mental health 
information. Since many of the quantitative 
measures we wished to make made it desirable 
that the messages include at least 100 words or 
so, the total sample was reduced to the following: 
40 male suicide; 29 female suicide; 13 male con- 
trol; 59 female control. The second set of mate- 
rials received from Edwin S. Shneidman consisted 
of 33 paired notes, one of each pair being a 
genuine suicide note and the other a simulated 
suicide note; a key to which was which accom- 
panied this set in a sealed envelope. We decided 
to use this set as a final test of our measures, after 
trying them out against the known suicide and 
normal letters. It was expected, however, that 
certain measures that would discriminate between 
suicide notes and ordinary letters probably would 
not do so between genuine and deliberately faked 
suicide notes—particularly measures reflecting the 
specific content of the message. 

Quantitative measures designed to test the four 
general predictions—intended indices of stereo- 
typy, of disorganization, of directive state, and of 
conflict—were devised and applied to the samples 
of known suicide notes and control letters. Six- 
teen measures were applied, along with certain 
additional analyses. Some of the measures are 
standard and well known in content analysis work; 
others were developed by us for this purpose. 
These were probably not the best measures that 
could have been devised, and they certainly do 
not exhaust the possibilities, but they do represent 
a considerable variety of quantitative estimates. 
The two investigators worked together in devising 
the measures, stabilizing the rules, and applying 
them to a small sample of notes. Each measure 
was then applied as consistently as possible to the 
total materials by one of us, not by both. We 
therefore have no direct evidence on the reliabil- 
ity of our measures across coders. For some of 
the measures, the objectivity of what was counted 
(e.g., number of repetitions, number of syllables 
per word) reduced the seriousness of this problem. 
Several of the less objective measures (e.g., evalu-
ative assertion analysis, distress-relief quotient,
type/token ratios, cloze procedure) have been 
checked for reliability by their authors, and these 
reports are in our references. To avoid redun- 
dancy, the detailed description of the measures 
will be given in connection with the results ob- 
tained with them. 


RESULTS 


Suicide Notes vs. Ordinary Letters to Friends 
and Relatives—The differences between scores of 
males and females were tested for statistical sig- 
nificance separately within suicide and control 
groups. If no sex difference was found, the male 
and female letters within groups were combined 
and the total suicide vs. control samples were then 
compared statistically. If a sex difference did 
appear, separate analyses for differences between 
suicide notes and controls were made for each 
sex. Nonparametric tests of significance were 
used, generally the median test, and occasionally 
chi square. In the former case, levels of signifi- 
cance were evaluated by reference to the Main- 
land and Murray tables (5). Conservative esti- 
mates of the significance of the differences are 
given since a two-sided hypothesis was tested in 
spite of the fact that the direction of the difference 
was predicted in all cases. 
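
The first step of a median test can be sketched as follows: each score is classed as above or not above the median of the pooled groups, giving a 2 x 2 table whose significance is then read from tables such as those cited above. The function name `median_test_table` is hypothetical, and the scores are made-up values used only to show the mechanics.

```python
from statistics import median


def median_test_table(group_a, group_b):
    """Return the pooled median and a 2 x 2 table of [above, not above] counts."""
    pooled = median(group_a + group_b)

    def split(scores):
        above = sum(1 for s in scores if s > pooled)
        return [above, len(scores) - above]

    return pooled, [split(group_a), split(group_b)]


# Made-up scores for illustration only; they are not data from this study.
group_a = [1, 2, 3, 4, 9]
group_b = [5, 6, 7, 8, 10]
pooled, table = median_test_table(group_a, group_b)
print(pooled, table)
```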

Stereotypy Measures.—1. Average number of
syllables per word. We would expect a person 
functioning under high drive to select words in 
terms of his strongest habits, i.e., familiar high 
frequency words. Since, as Zipf (12) has shown, 
there is an inverse relation between length of 
words and their frequency, and since longer, rarer 
words typically have more syllables, it follows 
that ordinary letters should have more syllables 
per word on the average than suicide notes. The 
total number of syllables per message, as esti- 
mated from breath pulses, was divided by the 
total number of words per message to obtain this 
index. There were no sex differences on this 
measure. Differences between suicide notes and 
control letters did not reach statistical significance 
but were in the expected direction. 

2. Type/token ratio (TTR). This measure is 
obtained by dividing the number of different words 
by the total number of words in each message. It 
has been shown to be a good index of lexical 
diversity, differentiating between educational lev- 
els, telephone vs. ordinary conversation, and so 
on (cf., 4). If high drive increases stereotypy,
we would expect suicide notes to display lower
TTRs than ordinary letters to friends and rela-
tives. There were no sex differences on this
measure, and differences between suicides and 
controls were significant at the .01 level in the 
predicted direction. 
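
The type/token ratio described above can be computed directly. The following is a minimal sketch, assuming a simple tokenizer (lowercased runs of letters and apostrophes); the tokenizing rule and the function names are assumptions and not the authors' coding rules. The example sentence is invented.

```python
import re


def tokenize(text):
    """Lowercase the message and split it into word tokens (assumed rule)."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text):
    """Number of different words divided by total number of words."""
    tokens = tokenize(text)
    if not tokens:
        raise ValueError("message contains no words")
    return len(set(tokens)) / len(tokens)


ttr = type_token_ratio("I love you and I love them")
print(round(ttr, 3))  # 5 different words out of 7
```

A lower ratio indicates less lexical diversity, which is the direction predicted for messages written under high drive.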

3. Repetitions. Another index of stereotypy in 
messages is redundancy in what is talked about.
We would expect people under high drive to re-
peat phrases more often than people under low
drive. Here repetition of single words did not 
count (cf. TTR above), but phrases and parts of 
phrases of more than one word did. For exam- 
ple, in “. . . I really love you very much . . .
and I really do love you . . . ,” the part phrase 
I really love you would count as repetition of 4 
words. For each message, the number of words 
repeated in this fashion was divided by the total 
number of words as an index of repetitiousness. 
Here, again, there were no differences between 
sexes, but the difference between suicide notes 
and ordinary letters was significant at the .01 level. 
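
One way to mechanize the repetition index is sketched below. It is an assumption about how the hand count might be approximated, not the authors' rule: a word counts as repeated when it lies inside a two-word sequence that has already occurred earlier in the message, so that single repeated words are ignored. The function name `repetition_index` is hypothetical.

```python
import re


def repetition_index(text):
    """Return (words in repeated phrases, that count divided by total words)."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        raise ValueError("message contains no words")
    seen = set()      # two-word sequences met so far
    repeated = set()  # positions of words inside a repeated sequence
    for i in range(len(tokens) - 1):
        pair = (tokens[i], tokens[i + 1])
        if pair in seen:
            repeated.update((i, i + 1))
        seen.add(pair)
    return len(repeated), len(repeated) / len(tokens)


example = ". . . I really love you very much . . . and I really do love you . . ."
count, index = repetition_index(example)
print(count, round(index, 3))
```

For the example quoted above, this rule marks "I really" and "love you" in the second clause, which yields the same count of 4 words given in the text, although the hand coding evidently tolerated the inserted word "do" and may differ on other messages.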

4. Noun-verb/adjective-adverb ratio. This 
measure—a modification of the familiar verb/ 
adjective ratio (1)—was obtained by dividing the
total number of nouns and verbs contained in the 
message by the total number of adjectives and 
adverbs. Definition of nouns, verbs, adjectives, 
and adverbs was done on the basis of whether 
the words could be substituted in linguistic test 
frames characteristic of the particular grammati- 
cal form. The rationale for the analysis is that 
under high drive states there should be less tend- 
ency toward modification of noun and verb forms, 
toward discriminative qualification of simple as- 
sertions, in line with our assumptions about the 
generalized energizing effects of drives. The pre-
diction therefore follows that the ratio should be
higher in the suicide than in the normal letters.
The results bore out this prediction at the .01
level of confidence. 
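
Once each word has been assigned to a grammatical class, the ratio itself is a simple quotient. The sketch below assumes the classification has already been done by hand (as with the test frames mentioned above) and is supplied as word and class pairs; the class labels, the function name, and the example words are all hypothetical.

```python
def noun_verb_adj_adv_ratio(tagged_words):
    """(nouns + verbs) / (adjectives + adverbs) for hand-classified words."""
    content = sum(1 for _, tag in tagged_words if tag in {"noun", "verb"})
    modifiers = sum(1 for _, tag in tagged_words if tag in {"adjective", "adverb"})
    if modifiers == 0:
        raise ValueError("no adjectives or adverbs; ratio is undefined")
    return content / modifiers


# Invented example; "other" covers every class not entering the ratio.
tagged = [
    ("the", "other"), ("old", "adjective"), ("dog", "noun"),
    ("ran", "verb"), ("home", "noun"), ("quickly", "adverb"),
]
ratio = noun_verb_adj_adv_ratio(tagged)
print(ratio)  # 3 nouns and verbs over 2 adjectives and adverbs
```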

5. Cloze measures. Taylor (11) has devised a
method of estimating redundancy or stereotypy 
in which a message is “mutilated” by substituting 
a blank for every nth word (say, every fifth word, 
as used here) and Ss try to fill in these missing 
items. Presumably, the more predictable the mes- 
sage as a whole, the more accurately Ss can per-
form this task and, hence, the higher will be the
cloze score. It follows that suicide notes should 
generate higher cloze scores than control notes. 
Subsamples of 10 male suicide, 10 male control,
10 female suicide, and 10 female control notes
were mutilated by substituting blanks for every 
fifth word. Because sex differences in content 
might be significant here, we had 34 male Ss fill 
in the male notes and 31 female Ss fill in the 
female notes; suicide and control letters were 
alternated in order of presentation. Each S’s 
mean cloze score for the 10 suicide notes and the 
10 control letters was computed. A chi square 
test was used to determine whether the proportion 
of Ss having mean suicide cloze scores higher 
than their control scores deviated significantly 
from chance. For male Ss completing male ma- 
terials, differences were significant at the .01 level 
in the expected direction; for females completing 
female material, however, there were no differ- 
ences whatsoever. 
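
The mutilation and scoring steps of the cloze procedure can be sketched as follows. The sketch assumes that a fill-in is scored correct only when it matches the deleted word exactly (ignoring case) and that the score is the proportion of blanks correctly filled; both scoring details, the function names, and the example sentence are assumptions made for illustration.

```python
def mutilate(words, n=5, blank="____"):
    """Replace every nth word with a blank; return the mutilated text and the key."""
    mutilated, answers = [], []
    for position, word in enumerate(words, start=1):
        if position % n == 0:
            mutilated.append(blank)
            answers.append(word)
        else:
            mutilated.append(word)
    return mutilated, answers


def cloze_score(guesses, answers):
    """Proportion of blanks filled with exactly the deleted word."""
    if not answers:
        raise ValueError("message has no blanks")
    correct = sum(g.lower() == a.lower() for g, a in zip(guesses, answers))
    return correct / len(answers)


words = "the quick brown fox jumps over the lazy dog again today".split()
mutilated, answers = mutilate(words)
score = cloze_score(["jumps", "now"], answers)
print(" ".join(mutilated), answers, score)
```

A more predictable message should allow more blanks to be restored and hence yield a higher score.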

6. Allness terms. People speaking or writing 
under high drive or emotion could be expected to 
be more extreme or polarized in their assertions. 
They should use more terms that permit no ex- 
ception, e.g., always, never, forever, no one, no 
more, everything, everyone, completely, perfectly, 
and so on. Strictly speaking, this is not a meas- 
ure of stereotypy, but it should be affected by 
generalized drive level. The number of such 
terms in each message was divided by total words 
and expressed as a rate per 100 words. Suicide 
notes yielded significantly more allness terms (.01 
level), and there were no sex differences. 
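
Several of the indices in this study, including the allness measure, are expressed as a rate per 100 words. The following sketch shows that computation under stated assumptions: the term list contains only the examples quoted in the text (the coders' full list is not given), the tokenizing rule is a simplification, and the sample sentence is invented.

```python
import re

# Only the examples quoted in the text; the authors' full list is not given.
ALLNESS_TERMS = ["always", "never", "forever", "no one", "no more",
                 "everything", "everyone", "completely", "perfectly"]


def rate_per_100_words(message, terms=ALLNESS_TERMS):
    """Occurrences of the listed terms, expressed as a rate per 100 words."""
    text = message.lower()
    total_words = len(re.findall(r"[a-z']+", text))
    if total_words == 0:
        return 0.0
    hits = 0
    for term in terms:
        # Whole-word match, so that "ever" is not counted as "never".
        hits += len(re.findall(r"\b" + re.escape(term) + r"\b", text))
    return 100.0 * hits / total_words


sample = "He never writes and everyone at home is quite well"
print(rate_per_100_words(sample))   # 2 terms in 10 words -> 20.0
```

The same function could serve for any term-list index expressed per 100 words by passing a different `terms` list.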

Disorganization Measures.—1. Structural dis- 
turbances. Extremely high levels of drive should 
result in disruption of the myriad of delicately 
balanced language encoding habits, according to 
theoretical analysis. To obtain a disturbance 
measure, the coder took the attitude of an English 
composition teacher, noting all grammatical, syn- 
tactical, spelling, and punctuation errors, and even 
clearly awkward constructions. Points where ma- 
terial was obviously omitted, e.g., “I don’t ( ) 
him any more,” were also counted. The index 
was the number of such errors expressed as a 
rate per 100 words. There were no sex differ- 
ences here and no significant differences between 
suicide and control notes, although the latter dif- 
ference was clearly in the expected direction. 

2. Average length of independent segments. 
We assume that people encoding under stress will 
tend to break their utterances into short, explosive 
units. Here we are interested in sentence length, 
but must correct for compound sentences joined 
together by conjunctions like and and but. The 
coder divided each message into the number of
segments that could stand by themselves as sen- 
tences. The index was the total number of words 
in each message divided by the number of such 
segments, yielding the average number of words 
per independent segment. Although there were 
no sex differences for control letters, there were 
for suicide notes, male suicides using significantly 
longer segments (.05 level). Comparing suicides 
with normals, we find no differences for females 
but a difference significant at the .05 level for 
males—male suicide notes used significantly 
longer independent segments than ordinary letters 
written by males, a finding that is contrary to the 
direction predicted. 

Directive State Measures.—1. Distress/Relief 
Quotient (DRQ). This well-known measure de- 
veloped by Dollard and Mowrer (2) is the ratio 
of distress-expressing phrases to the sum of these 
plus relief-expressing phrases, the former being 
indicative of disturbing drive states and the latter 
of the reduction of such states. This measure 
obviously depends to a considerable degree on the 
judgment of the coder. Here we found definite 
sex differences for both control and suicide mes- 
sages; females yielded higher ratios (more distress- 
expression), in both cases significant at the .05 
level. This difference may reflect a trait of mas- 
culine reticence in our culture. And, as might 
be expected from the nature of the suicide situa- 
tion, both male and female suicide notes displayed 
higher DRQs than ordinary letters to friends and 
relatives (.01 level in both cases). 
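
As a small worked example of the quotient just defined, the sketch below computes a DRQ from phrase counts. The function name and the counts are hypothetical; identifying distress and relief phrases remains a matter of coder judgment, as noted in the text.

```python
def distress_relief_quotient(distress_phrases, relief_phrases):
    """DRQ = distress / (distress + relief); undefined if nothing was coded."""
    total = distress_phrases + relief_phrases
    if total == 0:
        return None
    return distress_phrases / total


# Hypothetical counts for one message: 9 distress phrases, 3 relief phrases.
print(distress_relief_quotient(9, 3))   # 0.75
```

A message containing only distress phrases yields a quotient of 1.0, and one containing only relief phrases yields 0.0.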

2. Number of evaluative common-meaning 
terms. Common-meaning terms in a language 
are those upon whose denotation and connotation 
people must agree if they are to understand one 
another. Examples would be sweet, round, table, 
thunder, run, eat, and so on. They are in con- 
trast to attitude objects, like labor union and ex- 
Senator McCarthy, upon whose connotative mean- 
ings, at least, communicators need not agree. 
Evaluative common-meaning terms are those, like 
unfair, dangerous, sweetheart, and drunkard, 
which can be judged as clearly related to either 
good or bad. Our index here is simply the total
number of such terms in each message divided 
by the total number of words in the message. 
There are no differences between sexes for either 
suicides or normals in the simple number of eval- 
uative common-meaning terms (in contrast to the 
distress/relief measure above and percentage of 
positive evaluative assertions below), but differ-
ences between suicides and normals are significant 
at the .01 level, with suicides having more evalu- 
ative terms. 

3. Positive evaluative assertions. Evaluative 
Assertion Analysis has been described in detail 
elsewhere (7, 8). In essence, it is concerned with 
the linguistic isolation of statements that assert a 
relation between an attitude object and either 
another attitude object or an evaluative common- 
meaning term. Examples: I (EGO)/have always
respected/you (FATHER); You (SPOUSE)/could
never stand being/simply a loyal helpmate. As- 
sertive relations can be either associative (have 
always respected) or dissociative (could never 
stand being). For present purposes, analysis was 
restricted to those attitude objects representing the 
significant persons in Ego’s life and their relation 
to Ego; e.g., EGO, ALTER (person written to when 
not spouse, parent, or child), SPOUSE, CHILD, 
MOTHER, FATHER. The index with which we are 
presently concerned was obtained by dividing the 
number of positive evaluative assertions by the 
total number of evaluative assertions, positive and 
negative, i.e., the proportion of positive evalua- 
tions. It would be expected that this measure 
would correlate highly and negatively with the 
Distress/Relief Quotient, and it does. Although 
sex differences were not significant, they were in 
the direction of greater negative evaluation by fe- 
males in both suicide and normal letters. Differ- 
ences between suicide and ordinary letters home 
were significant at the .01 level and in the ex- 
pected direction. 

4. Time orientation. It was expected that the 
motivational state characteristic of suicide might 
direct interest of the writer away from the present 
toward the past. Therefore, suicide notes should 
contain fewer statements referring to the present 
and the future but more referring to the past. 
Examples: present reference—I love you, I’m 
afraid that . . . ; past references—I have tried 
. . . Everything you’ve done . . . ; future refer- 
ence— ... who will always love you, Tell my 
parents. . . . We measured both the proportion 
of total references which were to present time and 
the imbalance of nonpresent references toward 
past vs. future. Contrary to our expectations, 
there were neither significant differences between 
sexes nor between suicides and controls. 

5. Mands. According to Skinner (9), a mand 
is an utterance which (a) expresses a need of 
the speaker and which (b) requires some reaction
from another person for its satisfaction. It is usu- 
ally expressed in the form of an imperative, where 
the verb comes early in the utterance (e.g., Don’t 
feel too bad about this or Please understand me), 
but is not restricted to this form (e.g., I wish I
could see you, I hope you understand, or May 
God forgive me). Our index was the number of 
such constructions, expressed as a rate per hun- 
dred words. This proved to be one of the most 
useful measures in our arsenal. In the present 
test situation, differences between sexes were not 
significant, but differences between suicide notes 
and ordinary letters to friends and relatives were 
significant at the .01 level in the predicted direc- 
tion. 

Conflict Measures.—1. Qualification of verb 
phrases. When a speaker or writer is in conflict 
about the topics being discussed, it seems likely 
that he will modify or qualify his assertions away 
from the flat, direct present or past tense, e.g., 
from I was good to you to something like I used 
to be good to you, or I tried to be good to you. 
To quantify this characteristic of messages, the 
coder first bracketed each complete verb phrase 
for which a single verb could be substituted, e.g.,
for I (could have helped) you more we can sub- 
stitute I (loved) you more, where the one word 
loved substitutes structurally for the three words 
could have helped; then the coder totaled the num- 
ber of words in these brackets and divided by the 
number of such brackets. The larger this ratio, 
the greater the amount of excess, qualifying ma- 
terial. There were no sex differences on this 
measure, but differences between suicide and nor-
mal letters were significant at the .01 level in the
expected direction. 
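
The qualification ratio is simply the mean number of words per bracketed verb phrase. A minimal sketch follows, assuming that the coder's brackets are recorded as parentheses in the text; that encoding and the function name are illustrative choices, not part of the original method.

```python
import re


def qualification_ratio(coded_message):
    """Mean words per bracketed verb phrase; None if no phrase was bracketed."""
    phrases = re.findall(r"\(([^()]+)\)", coded_message)
    if not phrases:
        return None
    total_words = sum(len(phrase.split()) for phrase in phrases)
    return total_words / len(phrases)


# Phrases taken from the examples in the text: 3 + 3 + 1 words in 3 brackets.
coded = ("I (could have helped) you more. "
         "I (tried to be) good to you. "
         "I (was) good to you.")
print(qualification_ratio(coded))
```

A message made up entirely of flat, direct verbs, such as I (loved) you more, gives the minimum ratio of 1.0.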

2. Ambivalence constructions. There are a
number of syntactical constructions in English 
that may directly express ambivalence, conflict, 
and doubt on the part of the speaker: but, if, 
would, should, because (for, since), well, however, 
maybe, probably, possibly, seems, appears, guess, 
surely, really, except, etc. Certain question forms 
also express the same indecisive state, e.g., Must 
I do it? Why do I try at all? The coder deter- 
mined the number of such forms in each message 
and expressed it as a rate per 100 words. Differ- 
ences between sexes were not significant; differ- 
ences between suicide notes and ordinary letters 
were significant at the .01 level. 

3. Percentage of ambivalent evaluative asser-
tions. The essential nature of evaluative assertion 
analysis has already been described (see Directive 
State Measures numbered 2 and 3). If a speaker 
displays perfect consistency or lack of ambiva- 
lence, then all of the assertions relating to each 
attitude object or association between each pair 
of attitude objects will have the same sign. For 
example, assertions concerning the self would be 
either consistently positive or consistently negative 
—I am no good; I have been a failure; Luck has 
not been on my side; or for the relation of Ego to 
Spouse, I have always loved you; You relied on
me; I tried to help you; My Darling Wife. Am- 
bivalence, on the other hand, is indicated by asser- 
tions of different signs in the same set, e.g., I love
you, Honey; You never trusted me; I quarreled 
with you; You stuck by our marriage, though. 
Our index of assertion ambivalence was the total 
number of deviant assertions (i.e., the number of 
least frequent signs in each set, summing over 
sets) expressed as a proportion of total assertions. 
For this conflict measure, also, there were no sex 
differences, but suicide vs. control differences were 
significant at the .01 level and in the predicted 
direction. 
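
The index of assertion ambivalence can be illustrated with the two example sets quoted above. In this sketch each set is a list of signs; the signs assigned to the quoted assertions are our reading of them, and the data structure and function name are assumptions made for illustration.

```python
from collections import Counter


def ambivalence_index(assertion_sets):
    """Sum of least-frequent-sign counts over all sets, divided by total assertions."""
    deviant = 0
    total = 0
    for signs in assertion_sets.values():
        counts = Counter(signs)
        total += len(signs)
        # The less frequent sign in this set; zero when only one sign occurs.
        deviant += min(counts["+"], counts["-"])
    return deviant / total if total else None


coded = {
    # I am no good; I have been a failure; Luck has not been on my side
    "EGO": ["-", "-", "-"],
    # I love you, Honey; You never trusted me; I quarreled with you;
    # You stuck by our marriage, though
    "EGO-SPOUSE": ["+", "-", "-", "+"],
}
print(ambivalence_index(coded))   # 2 deviant assertions out of 7
```

The consistent set contributes nothing to the numerator, so all of the ambivalence in this example comes from the Ego-Spouse set.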

Genuine vs. Simulated Suicide Notes.—We had 
originally planned to apply the measures that had 
successfully differentiated suicide from control let- 
ters blindly to the entire set of 33 paired suicide 
and simulated notes. Unfortunately only 13 of 
these pairs included both suicide and faked notes 
of sufficient length to make most of our measures 
meaningful. In attempting to predict which of 
these 13 pairs were the genuine suicide notes, we 
eliminated those measures which had failed to 
differentiate suicides from normal controls (struc- 
tural disturbances, average length of independent 
segments, time orientation), those which obvi-
ously and nonsubtly reflected the suicide topic
and hence would be readily faked (distress-relief
quotient, evaluative terms, positive evaluative as-
sertions), and the cloze procedure (too few mes-
sages of sufficient length). A prediction of sui-
cide vs. simulated was made for each of the 13
paired notes on the basis of each of the remaining
nine measures, and the final prediction was based
on which note in each pair garnered the most
suicide votes. Our quantitative predictions proved
to be correct in 10 out of 13 cases, a value signifi-
cant at the .05 level.
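
The voting rule and the significance of 10 correct predictions in 13 can be sketched as follows. The text does not say which test produced the .05 figure; a one-tailed binomial (sign) test against chance is assumed here, and it gives a probability of about .046, which agrees with the reported level. The function names and the sample votes are hypothetical.

```python
from math import comb


def predict_genuine(votes):
    """votes holds one 'A' or 'B' per measure, naming the note that measure
    judged genuine; with nine measures a tie cannot occur."""
    return "A" if votes.count("A") > votes.count("B") else "B"


def binomial_upper_tail(successes, trials, p=0.5):
    """Probability of at least `successes` correct out of `trials` by chance."""
    return sum(comb(trials, k) * p ** k * (1 - p) ** (trials - k)
               for k in range(successes, trials + 1))


# Hypothetical votes of the nine measures for one pair of notes.
print(predict_genuine(["A", "A", "B", "A", "B", "A", "A", "B", "A"]))   # A
print(round(binomial_upper_tail(10, 13), 4))   # 0.0461
```

Under this assumption 9 correct out of 13 would not have reached the .05 level, since the corresponding tail probability is about .13.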

However, before checking the accuracy of these 
quantitative predictions against the key in the
sealed envelope, both authors independently as- 
signed all 33 pairs to suicide or fake categories 
on an intuitive basis. One of us got *14s correct 
and the other 2%s. To check on the possibility 
that we had actually been employing cues derived 
from our previous quantitative coding, we had 
eight graduate students with no prior experience 
with these notes assign the pairs to genuine sui- 
cide and fake categories. They were successful 
on the average in 16.5/33 cases, exactly chance. 
So it would appear that familiarity with a large 
sample of known suicide vs. nonsuicide notes, or 
sensitivity to cues derived from quantitative meas- 
ures, or both, contributes to successful identifica- 
tion of genuine suicide notes. 

Knowing which of the 33 pairs are genuine 
suicide notes, we may now ask which of our 
quantitative measures, successful in differentiating 
suicide from ordinary letters home, are also suc- 
cessful in differentiating genuine suicide from 
pseudo notes. This analysis should indicate which 
encoding characteristics of the suicidal individual 
can be intuited and, hence, faked by the nonsui- 
cidal person, and which cannot. We may look 
first at the very small sample of 13 pairs where 
both genuine and pseudo notes could be coded. 

Stereotypy Measures: Of these measures, three 
(syllables per word, repetitions, and allness terms) 
were clearly in the expected direction but not 
significantly so. One, the noun-verb/adjective-adverb
ratio, was significant at the .05 level in the pre-
dicted direction.

Directive State Measures: Of the directive state 
measures, DRQ, frequency of evaluative terms, 
and proportion of positive evaluative assertions 
did not differentiate (as expected), but mands did 
differentiate significantly at the .05 level. 

Conflict Measures: Of the three conflict meas- 
ures, one was not significant (qualification of verb 
phrases), one was significant at the .05 level, but 
in the wrong direction (ambivalence construc- 
tions) and one was barely significant in the pre- 
dicted direction at the .10 level (proportion of 
ambivalent assertions). 

If we enlarge our sample to 24 suicide and 18 
faked notes by scoring all notes of sufficient 
length, regardless of their pairing, about the same 
results appear: Stereotypy measures tend in the 
right direction, but only the noun-verb/adjective- 
adverb ratio significantly so; mands just miss sig- 
nificance at the .05 level; proportion of ambiva- 
lent assertions approaches significance in the ex-
pected direction, but the other conflict measures 
are either nondifferential or significant in the 
wrong direction. Interestingly, the two disorgan- 
ization measures, which were computed for this 
larger sample, approach (structural disturbances) 
or reach (length of independent segments) signifi-
cance at the .05 level and in the predicted direction.

Although gross measures of “what is talked 
about” like the DRQ and negative evaluative as- 
sertions may not differentiate genuine from pseudo 
suicide notes, we may ask if a more detailed con- 
tent analysis might reveal differences. Accord- 
ingly, the frequency with which lexical words 
(nouns, verbs, adjectives, and adverbs) were used 
in the 33 genuine and facsimile suicide notes was 
analyzed. Since the sample of Ss was small, and 
hence liable to bias by discussion of a particular 
topic by a single S, subject-frequencies rather than 
word-frequencies per se were counted, i.e., the use 
of a given word by a given S was only counted
once no matter how often he used it. 
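
The difference between subject-frequencies and raw word-frequencies can be shown with a short sketch. The word lists below are invented, and the function name is an assumption; the point is only that a word repeated by one writer is counted once for that writer.

```python
from collections import Counter


def subject_frequencies(notes):
    """Number of writers using each word; repeats within one note count once."""
    counts = Counter()
    for words in notes:
        counts.update({word.lower() for word in words})
    return counts


# Invented lexical words from three notes, one list per writer.
notes = [
    ["love", "love", "tell", "mother"],
    ["love", "know"],
    ["know", "know", "know"],
]
freq = subject_frequencies(notes)
print(freq["love"], freq["know"], freq["tell"])   # 2 2 1
```

A raw word count would credit know with four occurrences; the subject-frequency of two prevents the third writer's repetitions from dominating the tally.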

Table 1 gives the words included in five or 
more of the 33 suicide notes and in five or more 
of the simulated notes. Again, we note evidence 
for greater stereotypy in the suicide group; in 
every category, suicide notes display a greater 
sharing of common lexical items than do simu- 
lated notes. Some of the differences in choice of 
most frequently used lexical items are interesting: 
Suicide notes are more heavily loaded with terms 
of endearment (darling, dear, honey) and refer- 
ences to mother, whereas faked notes have more 
abstractions (life, way, all) and references to 
insurance. Whereas genuine suicide notes are re- 
plete with verbs referring to simple action (tell, 
do, get, say, take, give), faked suicide notes in-
clude relatively more verbs referring to mental
states (know, think, seem, see). The genuine 
suicides have more stress on positive states (love 


TABLE 1


WORDS INCLUDED IN FIVE OR MORE OF THE GENUINE AND SIMULATED SUICIDE NOTES


Nouns                            Verbs                            Adjectives and Adverbs
Genuine          Simulated       Genuine            Simulated     Genuine      Simulated
everything (9)   life (13)       love (19)          know (10)     good (15)    good (9)
way (out) (9)    way (11)        tell (13)          leave (9)     sorry (11)   sorry (8)
wife (9)         way (out) (8)   know (12)          think (9)     only (7)     happy (6)
love (8)         thing (8)       hope (11)          have (8)      dear (6)     all (5)
mother (8)       wife (7)        please (11)        please (8)    bad (5)
thing (8)        all (8)         think (11)         love (7)
God (6)          love (5)        do (10)            forgive (6)
time(s) (6)      insurance (5)   get (9)            hope (5)
darling (5)                      say (9)            seem (5)
dear (5)                         take (9)           see (5)
honey (5)                        give (8)           tell (5)
life (5)                         want (8)
one (5)                          feel (7)
person (5)                       goodbye (7)
something (5)                    have (7)
trouble (5)                      make (7)
way (5)                          go (6)
year(s) (5)                      help (6)
                                 see (6)
                                 forgive (5)
                                 try (5)
                                 take care of (5)

19 and hope 12 vs. 7 and 5 for the same words
in pseudo notes), whereas the simulated notes 
have 9 references to leave vs. only 3 for suicides. 
To summarize this rough content comparison, 
genuine suicide notes reflect ambivalence toward 
loved ones through higher frequency of positive 
evaluative terms (cf., ambivalent assertions above) 
and they reflect greater concreteness. 

Finally, a contingency analysis * of some of the 
major content categories in suicide and pseudo 
suicide notes was made in an attempt to get at the 
association structures characteristic of the two 


TABLE 2


RELATIVE FREQUENCY OF CASES REFLECTING
VARIOUS CONTENT CATEGORIES


Categories                                   Genuine       Simulated

Spouse praise, defense, love                   .69           .42
Self criticism                                 .48           .39
I’m sorry; forgive me                          .45           .36
Self praise, defense                           .39           .06
Children                                       .36           .33
Goodbye, farewell, etc.                        .30           .12
Feelings of confusion, being tired, etc.       .27           .42
Spouse criticism                               .24           .03
“Way out”                                      .24           .39
Physical disabilities, symptoms                .21           .18
Parents                                        .21           .06
God and religion                               .21           .06
Material possessions                           .21           .00
Reference to suicidal act                      .18           .33
Money, bills, debts                            .15           .06
Notify, tell someone                           .15           .03
Isolation, loneliness                          .15           .00
Insurance, etc.                                .12           [illegible]
Reference to suicide note                      .12           .06
Instructions about own remains                 [illegible]   .03
Job                                            .09           .03
Love triangles (other man, woman)              .09           .06
“Fate,” “Life,” “World,” etc.                  .03           .33
Sex relations                                  .03           .00


groups. The content categories given in Table 2 
were used. Before discussing the results of the 
contingency analysis, some of the differences in 


* This method is described in some detail in Pool
(in press).
relative frequency of reference to these categories
in the genuine and spurious suicide notes are 
worth noting: Suicide notes have relatively higher 
frequencies of reference to self praise and defense, 
good-byes and farewells, criticism of the spouse, 
references to parents, God and religion, and ma- 
terial possessions; simulated notes refer relatively 
more frequently to feelings of confusion, being 
tired, and the like, to the suicidal act itself, to 
insurance, and to abstractions like Fate, Life, and 
the world. 

Expected and obtained contingencies among 
these categories (for genuine suicide and simu- 
lated notes separately) were obtained in the fol- 
lowing way: Letting A and B represent two con- 
tent categories, the expected contingency is $p_A p_B$,
i.e., the probability of both A and B being present 
in notes of a given type, based on their separate 
rates of occurrence. The obtained contingency is 
simply the relative frequency of actual co-occur- 
rence, i.e., the percentage of notes of a given 
type in which contents A and B are actually both 
present. The obtained contingency may be either 
greater than (association) or less than (dissocia- 
tion) the expected or chance contingencies. Sig- 
nificances of deviations from chance expectancies 
are estimated in terms of the standard error of 
the expected percentage. Because of the crude 
nature of this analysis and the rather small N, 
significances at the .10 level or better were used 
as the basis for the following summary statements; 
they should be considered to be merely suggestive. 
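
The contingency computation described above can be sketched as follows. The text says only that significance was estimated from the standard error of the expected percentage; the usual binomial form, the square root of pq/N with p the expected contingency, is assumed here. The function name and the ten coded notes are hypothetical, with category labels borrowed from Table 2.

```python
from math import sqrt


def contingency(notes, a, b):
    """Expected and obtained co-occurrence of categories a and b, plus the
    deviation in standard-error units (None if the error is zero)."""
    n = len(notes)
    p_a = sum(a in note for note in notes) / n
    p_b = sum(b in note for note in notes) / n
    expected = p_a * p_b
    obtained = sum(a in note and b in note for note in notes) / n
    # Assumed form of the standard error of the expected proportion.
    se = sqrt(expected * (1 - expected) / n)
    z = (obtained - expected) / se if se > 0 else None
    return expected, obtained, z


# Ten hypothetical notes, each coded as the set of categories it contains.
notes = ([{"spouse criticism", "insurance"} for _ in range(4)]
         + [{"spouse criticism"}]
         + [set() for _ in range(5)])
expected, obtained, z = contingency(notes, "spouse criticism", "insurance")
print(round(expected, 2), round(obtained, 2), round(z, 2))
```

Here the categories occur in 5 and 4 of the 10 notes, so the expected contingency is .20, while the obtained contingency is .40; a positive deviation indicates association and a negative one dissociation.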

In the genuine suicide notes we find criticism 
of the spouse associated with references to insur- 
ance, money, bills and debts, and requests to 
notify someone of his death. As might be ex- 
pected, requests to notify are associated with ref- 
erences to the suicide note itself and with instruc- 
tions about handling one’s remains. Expressions 
of feeling isolated and lonesome are associated 
with references to money, bills and debts, and to 
love triangles. References to the parents appear 
with statements about taking a “way out” and with 
references to material possessions. References to 
own children appear with instructions about han- 
dling remains. Again as would be expected, ref- 
erences to money, bills and debts co-occur with 
references to material possessions. Less obvi- 
ously, references to the job are contingent upon 
references to the suicide note itself; self praise is 
contingent upon references to insurance. 

In the simulated notes, references to own chil- 
dren are associated with references to God and
religion, while references to parents are associated 
with stereotyped abstractions, Fate, Life, the 
World. When people write “make-believe” sui- 
cide notes, references to the suicide act itself tend 
to be accompanied by references to God and re- 
ligion, and talking about a “way out” appears 
with comments about money, bills and debts. 
Expressions of sorrow, regret, and asking for for- 
giveness appear with saying goodbye and farewell 
in these faked notes. There are also two signifi- 
cant dissociations (co-occurrence less than chance 
at the .10 level)—faked notes that speak of 
physical disabilities do not express feelings of con- 
fusion and being tired, and the notes which refer 
to insurance do not include expressions of sorrow, 
regret, and asking for forgiveness. 

In viewing the total evidence on association 
structures, one gets the following general impres- 
sion: When people produce fake suicide notes 
“on demand,” they generally embroider a few 
standard themes available in our folklore—taking 
a “way out” of financial and other problems, ask- 
ing forgiveness and saying “farewell,” pondering 
on the moral and religious implications of taking 
one’s own life, and so on. The patterns of associ- 
ation in genuine suicide notes suggest more mun- 
dane connections—for example, criticism of the 
spouse being connected in the suicidal person’s 
mind with his financial problems, with being in- 
sured and the like, or references to being insured 
being coupled with self-praise. 


DISCUSSION 


Of the four general predictions about the effects 
of heightened motivation upon encoding, three 
are borne out clearly in the comparison of suicide 
notes with ordinary letters to friends and relatives. 
Suicide notes display greater stereotypy—the 
writer of a suicide note tends to use shorter, sim- 
pler words, his vocabulary is less diversified, he is 
more repetitious, he uses more simple action ex- 
pressions (nouns and verbs) and fewer discrimi- 
native qualifiers (adjectives and adverbs), and his 
messages are more easily filled in (cloze proce- 
dure) by others. He also uses more polarized 
“allness” terms. The effects of the suicidal direc- 
tive state are also clearly evident—in higher dis- 
tress-relief quotients, in the greater frequency of 
evaluative terms, and in the smaller proportion of 
evaluative assertions that are positive in direction. 
Suicide notes also display the demanding, com- 
manding, pleading nature of this state by higher
frequency of mands. Suicide notes yield evidence 
of greater conflict of motives—by greater qualifi- 
cation of verb phrases, more ambivalence con- 
structions, and a larger percentage of evaluative 
assertions about Ego and significant others that 
are ambivalent in sign. Most of these differences 
were significant at the .01 level, and they sub- 
stantiate our general hypotheses about the effects 
of heightened drive level upon language encoding. 

One major prediction was not borne out: there 
was no evidence for greater disorganization of 
encoding behavior in suicide notes as compared 
with ordinary letters to friends and relatives. We 
conclude that the suicide state, at least at the time 
a note is penned, does not represent a sufficiently 
high degree of motivation to cause disruption of 
language skills, but the negative result could also 
indicate that our measures of disorganization were 
inadequate or that the hypothesis was wrong.
The failure of the time orientation measure to 
yield any differences may also mean that our 
notions were wrong. It is also possible that this 
measure was confounded with that for mands; 
mands usually have future reference and are sig- 
nificantly more frequent for suicide notes. 

The comparison of genuine suicide notes with 
simulated suicide notes, matched for age, educa- 
tion, and general social status, can be considered, 
on the one hand, a more stringent test of the 
hypotheses or, on the other hand, an indication 
of the degree to which a nonsuicidal person can 
intuit and adopt the encoding content and style 
of the suicidal person. From the former point of 
view, we would have to conclude that most of our 
measures fail to distinguish significantly between 
the genuine and pseudo notes (excepting the noun- 
verb/adjective-adverb ratio, mands, length of in- 
dependent segments, and perhaps the proportion 
of ambivalent assertions). Nevertheless, the 
quantitative indices that differentiated suicide 
from ordinary letters, and which might be ex-
pected to differentiate genuine from faked notes,
did so for 10 out of the 13 matched pairs to
which they could be applied.

How well can nonsuicidal writers adopt the 
encoding content and style of the suicidal state? 
First, nonsuicidal people can obviously intuit the 
superficial content of suicide notes—the distress- 
expression, the use of evaluative terms, and the 
decrease in positive evaluative assertions. Less 
superficially, however, we note interesting differ- 
ences in the words used (more positively toned
terms, expressing the ambivalence of the true sui- 
cidal state, and more concrete terms generally) 
and in the contingencies among content categories 
(less stereotyped, “story-book” associations in the 
true suicidal cases). Second, although the over- 
all reduction in significance of differences, as com- 
pared with the suicide vs. control analysis, shows 
that the style of the suicidal person can be adopted 
to some degree by a person merely instructed to 
write such a note, there are certain exceptions. 
The person faking a suicide note fails to reflect 
the demanding, commanding, pleading style 
(mands), the reduced qualification (noun-verb/ 
adjective-adverb ratio), and the evaluative am- 
bivalence toward self and others of the genuine 
suicide notes. 

One criticism that could be leveled at this study 
is that other determinants than motivation might 
be responsible for the results. It is known that 
many of the indices used here as tests of motiva- 
tional effects can be affected by other source char- 
acteristics as well. For example, stereotypy meas- 
ures like length of words and TTR are influenced 
by the education and IQ level of the source. Our 
control sample of ordinary letters to friends and 
relatives could be matched with the suicide notes 
in terms of sex and age, but that was all. Could 
the differences we found be accounted for simply 
on the basis that our suicide note writers were 
less intelligent and/or less well educated? If we 
explain the differences in stereotypy in this way, 
we are unable to explain why the same notes 
showed no differences in structural disturbances 
(ordinary English composition, for the most part) 
and, in fact, for males showed longer integrated 
sentence segments. Furthermore, this would not 
explain the directive-state differences (e.g., in
mands). Also, in the genuine-pseudo compari- 
son, where these factors were controlled by match- 
ing, differences for the most part were in the 
same direction, although not as large.


SUMMARY 


Theoretical analysis of the effects of motivation 
level upon language encoding led to several hy-
potheses. Messages produced under heightened
drive level should (a) be more stereotyped, (b) 
be more disorganized, if the motivation level is 
extremely high, (c) reflect the specific nature of 
the motives operating, and (d) reflect conflict of 
responses if two or more competing motives are
operating. These hypotheses were tested by (a)
a comparison of suicide notes with ordinary letters 
to friends and relatives and (b) a comparison of 
genuine suicide notes with simulated suicide notes, 
written by nonsuicidal people. In the first com- 
parison, all of the hypotheses were clearly borne 
out except that concerning disorganization of en- 
coding skills. In the second comparison, differ-
ences were smaller, with only certain measures (the
noun-verb/adjective-adverb ratio, Skinner’s mands,
length of sentence segments, and proportion of
ambivalent evaluative assertions) still discriminat-
ing significantly. Implications of these results for
psycholinguistic theory and for stylistics are con- 
sidered in the discussion, along with certain criti- 
cisms that could be made of the study. 


SELF-REFERENCE IN
COUNSELING INTERVIEWS * 


VICTOR C. RAIMY †


Successful counseling or psychotherapy implies 
that changes take place in the personality of the 
client or patient. The problem of the psycholo- 
gist who seeks to analyze conscious, deliberate at- 
tempts to alter behavior in a clinical situation, lies 
essentially in determining specifically what changes 
take place during treatment and what conditions 
are necessary to produce them. Many things have 
been suggested as “essential changes” but so far 
there is almost a complete dearth of reliable evi- 
dence, aside from clinical interpretations, to bol- 
ster the constructs which have been advanced. 
Among the terms in common use which sup- 
posedly define such changes are release of feelings 
or tensions, emotional re-education, making the 
unconscious conscious, modifying responses, re- 
organization or reintegration of personality, modi- 
fication of goals or pathways to goals, etc. 

Such terms are so schematic, so indefinite or 
so circular that the very existence of the psycho- 
logical events to which they refer cannot be deter- 
mined by objective measures or else every event 
considered fits the theory. Changes in “needs” 
or in “attitudes” or in “traits” have also been sug- 
gested as occurring in therapeutic situations as 
well as in normal personality development. So 
far there has been little systematic application of 
these concepts to the artificially induced changes 


* Reprinted by permission from the Journal of 
occurring during treatment. As for trait con-
structs, in clinical work these have usually been 
applied to manifestations of ability or capacity 
which the counselor is more interested in un- 
masking than in modifying. 

It becomes imperative to search for constructs 
which can be investigated with the usual safe- 
guards of objectivity and reliability. When and 
where to apply available and plausible measures 
is at present a matter of feasibility rather than 
desirability. The methods of measurement are 
crude and time consuming. Clinicians are usually
unwilling to disturb either their own delicate methods
of treatment or the precarious balance
achieved by the individuals treated. The application
of measuring methods has been a matter
of catch-as-catch-can, although recent developments
in the use of recording equipment open a
new field for the analysis of hitherto unobtainable 
data within which may be found some of the 
essential clues. Unfortunately, techniques for 
analyzing verbatim interview material are in their 
infancy. 

Most certainly, the events taking place in the 
counseling interview are the events which furnish 
a very logical focus and locus of investigation. 
It is equally certain that even verbatim recordings 
furnish only selected portions of the interview 
events. Such recordings contain only the re- 
marks of the counselor and the client’s verbaliza- 
tions, most of which are concerned with himself 
and his relations to other people. How can one 
abstract from such material clues which are rele- 
vant to the basic problem of how personality un- 
dergoes change? 

The Self-Concept and Personality Organization. 
—Perhaps in the search for dominant factors in 
personality the obvious has been too long neglected 
because it is obvious and therefore deemed unworthy
of attention. In our sophisticated intercourse
with personality dimensions which require
for their comprehension a neologism, a factor 
analysis or a re-redefinition, it is only too easy to 
neglect in the laboratory much of the wisdom we 
use in the parlor or office. 

What a person believes about himself is a gen- 
erally accepted factor in the social comprehen- 
sion of others. Peculiar or just different behavior 
in our associates can frequently be understood by 
cliches such as “He has an inferiority complex”
or “She is conceited.” In such thumbnail analyses 
we are referring to a description of himself which 
the person referred to has apparently accepted and
acts upon. The analysis, if accurate, is useful in 
understanding others even when we are ignorant 
of the historical development of the self-belief, as 
is usually the case. 

In the present study, the Self-Concept theory 
postulates that a person’s notion of himself is an 
involved, complex and significant factor in his 
behavior. One can build a systematic theory of 
personality organization which neglects neither his- 
torical nor physiological events yet depends essen- 
tially on the data of immediate experience. Briefly, 
the Self-Concept theory (or the Body-Schema of 
Schilder or the Ego of Koffka) predicates that 
each individual’s perception of himself is of ulti- 
mate psychological significance in organized be- 
havior. The person in his biological, social and 
historical setting is the concrete object of self-per- 
ception. The Self-Concept is the more or less 
organized perceptual object resulting from present 
and past self-observation. Self-perception is a 
process which is more than activation of internal 
or distance receptors. In agreement with text- 
book definitions, there is in self-perception an 
organization which involves memorial and situa- 
tional factors as well as the sense data themselves. 

To oversimplify the theoretical position, one 
can say that we perceive ourselves just as we per- 
ceive a chair or another person. What we per- 
ceive in ourselves (the Self-Concept) may have 
only partial correspondence with what other peo- 
ple perceive in us or the so-called objective per- 
sonality. Yet, as always, we behave in accord- 
ance with our own perceptions even though the 
opinions of others or the urgencies of our bio- 
logical makeup interact to influence our percep- 
tions of ourselves. Our general behavior, then, is 
to a large extent regulated and organized by what 
we perceive ourselves to be just as behavior toward 
a chair is regulated by our perception of a given 
chair. When tired we observe what appears to 
be a rickety antique and walk away in search of 
something more substantial quite unaware that 
the wary hostess has introduced reinforcements 
at the joints. In the same way, we may perceive 
ourselves to be fatigued and spend hours resting 
when the physiological facts may or may not be 
in agreement with our self-perception. 

How can Self-Concept theory be applied to 
those changes which take place in personality 
during counseling and psychotherapy? It is exceedingly
difficult to obtain objective and meaningful
data from the verbalizations which are the
raw material of immediate experience. Some 
studies of content changes before and after coun- 
seling have been made by the writer but these 
are fragmentary approaches. The method chosen 
in the present study postulates that self-approval 
and self-disapproval represent two ends of a con- 
tinuum which may be viewed as one of the major 
dimensions of the Self-Concept. McDougall’s 
“sentiment of self-regard” obviously refers to this 
fact: that persons make self-evaluations. He says 
(4, p. 428) concerning the ubiquity of this sen- 
timent: 


And the conative dispositions of the system, 
being brought into play so frequently, by every 
social contact whether actual or imagined, be- 
come delicately responsive in an extraordinary 
degree, as well as very strong through much 
exercise. 


This study was therefore directed toward simple 
quantitative analysis of changes in self-approval 
displayed by college student clients. Fourteen 
complete series of counseling interviews were 
analyzed. 

The basic postulations of the study were as fol- 
lows. The Self-Concept is the map which each 
person consults in order to understand himself, 
especially during moments of crisis or choice. 
The approval, disapproval or ambivalence he 
“feels” for the Self-Concept or some of its sub- 
systems is related to his personal adjustment. A 
heavy weighting of disapproval or ambivalence 
suggests a maladjusted individual since maladjust- 
ment in a psychological sense inevitably implies 
distress or disturbance in connection with oneself. 
When successful personality reorganization takes 
place in a maladjusted individual we may also 
expect a shift from self-disapproval to a positive 
or self-approving balance. The adjusted individ- 
ual may dislike or disapprove of certain aspects 
of the Self-Concept but in general he finds him- 
self to be attractive and desirable. The studies 
of self-ratings, presumably carried out on unselected
populations, corroborate the final postulation.
Allport summarizes (1, p. 444), “In self-ratings
there is a tendency to overestimate those
qualities considered desirable and to underesti- 
mate those considered undesirable.” 

The present study was limited by the small num- 
ber of completely recorded counseling cases avail- 
able. Despite this limitation, other investigators 
have used similar methods and emerged with 
fruitful hypotheses if not completely defensible
conclusions. These studies, beginning with Por- 
ter’s (5) highly original exploitation of the check 
list method with interviews, are carefully reviewed 
in Curran’s recent book (3). 

The Method of Classifying Self-References.— 
In order to quantify the verbalized changes in 
self-approval taking place during counseling, a six 
category checklist was devised to permit classifi- 
cation of all client utterances. Reproduced below 
is a summary of the method used to classify ver- 
batim transcripts of 14 completely recorded coun- 
seling cases. The summary is an abridged form 
of a six page set of instructions given to four 
judges who participated in determination of relia- 
bility. 


SUMMARY OF DIRECTIONS FOR CLASSIFYING 
SELF-REFERENCES 


Purpose: To classify the responses of clients in 
counseling interviews into categories based on the 
client’s attitude toward himself. It is hoped that 
this procedure will be useful as a means of show- 
ing changes in the client during the course of 
counseling. 

Definition of Self-References: A Self-Reference 
(SR) is a group of words spoken by the client 
which directly or indirectly describes him as he 
appears in his own eyes. In a counseling inter- 
view the client is usually discussing himself and 
his reactions. Responses of the client which are 
not self-references are called External References 
or “Other” and are symbolized as “O.” 

The Unit Is the Client’s Complete Response: A 
complete response consists of all words spoken by 
the client between two responses of the counselor. 
Each numbered response of the client in the type- 
script is to be classified for its self-reference evalu- 
ation or lack of self-reference. For purposes of 
brevity, the self-evaluating aspect of the response 
is referred to as its “value,” which is simply the 
positive or negative attitude toward self manifest 
in the response. 

The Task of the Judge: 1. Read through the 
complete interview or portion of an interview in 
order to familiarize yourself with the general con- 
tent of the material and the manner of speaking 
of the client and counselor. 2. Starting with the 
first numbered client response, decide whether 
it is a self-reference or external reference. 3. If 
it is an SR, classify it in one of the categories 
described below. 4. The decision as to the meaning
intended by the client in a particular response
should rest upon a common-sense deduction and 
should not be a tenuous theoretical deduction 
based upon a series of inferences. 

The Six Categories: 


P     Positive SR indicating a positive or favoring attitude toward
      self.

N     Negative SR indicating a negative or disapproving attitude
      toward self.

Av    Ambivalent SR in which there is a clear conflict between the
      positive and negative attitudes toward self in the same
      response.

A     Ambiguous SR in which some self-reference is manifest but
      either the value is too vague to be classified or the response
      lacks value altogether.

O     Other or External Reference in which the client himself is not
      implicated.

Q     A nonrhetorical question in which the client is actually asking
      for information. If a question is only part of a complete
      response, the question is ignored in the classification.
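
As an illustration only, the six-category checklist can be held in a small data structure and the judged responses of one interview tallied by category. The sketch below is not part of the original procedure; the names `Category` and `tally` and the sample judgments are hypothetical.

```python
from collections import Counter
from enum import Enum


class Category(Enum):
    """The six checklist categories, keyed by the symbols used in the tables."""
    P = "positive self-reference"
    N = "negative self-reference"
    AV = "ambivalent self-reference"
    A = "ambiguous self-reference"
    O = "other or external reference"
    Q = "nonrhetorical question"


def tally(classified_responses):
    """Count how many client responses of one interview fall in each category."""
    counts = Counter(classified_responses)
    # Report every category, including those that never occurred.
    return {category: counts.get(category, 0) for category in Category}


# Hypothetical judgments for one short interview (not data from the study).
interview = [Category.N, Category.N, Category.AV, Category.P, Category.O, Category.N]
totals = tally(interview)
print({category.name: count for category, count in totals.items()})
```

Each element of `interview` stands for one complete client response, the unit defined in the directions, so the tally has exactly one entry per numbered response.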


Some Problems Involved in the Method.—Find- 
ing units in oral material is no less difficult than 
finding the units of any psychological event. The 
arbitrary definition of the unit in this study as 
all words spoken by the client between two ut- 
terances of the counselor provides objectivity and 
avoids the fractionation which so often obscures 
meaning. It is possible that some of the client 
responses treated as units may be fragments of 
units or may be multiple units. While the unit 
of self-reference may be patently nonunitary in 
some instances, it is defined objectively and repe- 
tition of the procedure becomes possible. 

A further problem is found in the assumptions 
underlying the use of a method which simply 
summates the responses in each category for a 
complete interview. Why should one expect to 
obtain an accurate indication of a client’s atti- 
tude toward himself by such a procedure? Logic 
would predict that certain factors in the inter- 
view situation such as reactions to the counselor, 
defensiveness, etc. would result in distortions of
the true attitudes. It is believed that distorting 
factors are present and do operate particularly 
during early stages of contact. It needs little 
imagination to envisage the clumsy or ardent in- 
terviewer forcing a client into defensive or angry 
self-description. When, however, the counselor 
aims skillfully at obtaining free expression by the 
client, there is reason to suspect that the resulting 
responses will bear a considerable resemblance to 
the picture which the client perceives. The use 
of simple summation as a method of analysis is 
crude and liable to a variety of distorting factors. 
The problem thus becomes one of validity. One 
can theorize that an individual will reveal himself 
by the quantity of his self-references as well as 
in the quality of any selected individual response. 
Perhaps the frequency with which a person re- 
turns to a given topic indicates the concern or 
importance that topic has for him. Concretely, 
if a person thoroughly approves of himself we 
should expect to find very few if any statements 
of self-depreciation in free conversation. (Stylis- 
tic phrases intended to insure modesty are not 
difficult to identify.) Present day measures of 
personality, projective or nonprojective, utilize 
such a frequency principle in one form or an- 
other. 

The Reliability and Objectivity of the Method. 
—Two determinations of reliability were made. 
The writer classified two complete cases totalling 
874 client responses in 13 verbatim interviews.²
Six months later he reclassified the same material 
without referring to the earlier classification. A 
frequency count of the number of times the same 
response was reclassified in the same category re- 
vealed 80.8% identical classification for the 874 
client responses. The results of similar analysis 
by particular categories are shown in Table 1. 
Thus of the three significant categories, positive 
and negative self-references were identified with 
much more certainty than were ambivalent re- 
sponses. 
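
The frequency count described above can be sketched in a few lines. This is a hedged illustration, not the writer's own computation: the function names and sample labels are hypothetical, and the per-category figure is assumed here to be the share of responses first placed in a category that were placed there again, since the text does not state the denominator.

```python
def percent_agreement(first_pass, second_pass):
    """Per cent of responses given the same category on both classifications."""
    if len(first_pass) != len(second_pass):
        raise ValueError("both passes must cover the same responses")
    same = sum(a == b for a, b in zip(first_pass, second_pass))
    return 100.0 * same / len(first_pass)


def agreement_by_category(first_pass, second_pass):
    """Per cent agreement for each category.

    Assumption: the base for a category is the number of responses placed
    in it on the first classification.
    """
    result = {}
    for category in sorted(set(first_pass)):
        pairs = [(a, b) for a, b in zip(first_pass, second_pass) if a == category]
        same = sum(a == b for a, b in pairs)
        result[category] = 100.0 * same / len(pairs)
    return result


# Hypothetical labels for eight responses classified twice (not study data).
first = ["P", "N", "N", "Av", "O", "N", "P", "Q"]
second = ["P", "N", "Av", "Av", "O", "N", "A", "Q"]
overall = percent_agreement(first, second)
by_category = agreement_by_category(first, second)
print(overall, by_category)
```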

An interview-by-interview chi-square analysis 
of the differences between the two classifications 
by the writer indicated that probably all such dif- 
ferences in category totals could be attributed to 
chance factors. Chi-square values for the differ- 
ences were all less than those required at the .05 
level. 


2 Both cases have been published in toto and can 
be found in references (6) and (7). 


TABLE 1


PER CENT OF AGREEMENT WHEN 874 RESPONSES
WERE RECLASSIFIED AFTER A 6 MONTH
INTERVAL. ONE JUDGE


Category    % Agreement
P           72.9
N           76.8
Av          47.5
A           61.9
O           64.5
Q           87.2


The same method of studying reliability was 
used with the data obtained from four judges all 
of whom classified the same four selected inter- 
views. All judges were graduate students in clini- 
cal psychology who were given carefully detailed 
written instructions plus a one hour conference 
period before they independently classified all 
client responses in the four interviews. The four 
interviews were selected as follows: two were be- 
lieved easy and two were believed hard to classify; 
two were from successful cases, two from unsuc- 
cessful; three were from cases counseled some- 
what nondirectively, one from a case of very 
directive counseling; two were first interviews, one 
was a second contact, one was a fourth and 
penultimate interview; two of the interviews were 
conducted by the same counselor. A total of 356 
client responses were classified by the four judges. 

Analysis of the classification by the four judges 
revealed results very similar to those obtained by 
the writer with a six month interval between 
classifications. With agreement between three of 
the four judges as a criterion, 81.8% of all client 
responses were classified “correctly” by three of 
the four judges. As in the analysis of particular 
categories by the writer, the judges found P and 
N responses fairly easy to detect while Av was 
more difficult to detect with certainty. Table 2 
presents the percentage results for each category. 
Chi-square analysis of the differences between 
judges for each of the four interviews revealed 
that for one of the interviews chi-square had a 
value greater than that required at the .05 level 
indicating that there is much probability that for 
one of the four interviews the difference between 
judges is no accident. 
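
The three-of-four criterion can be expressed as a short computation. The sketch below assumes that each response carries one label from each judge; the function name and the sample judgments are hypothetical and are not drawn from the study.

```python
from collections import Counter


def majority_agreement(judgments_per_response, required=3):
    """Per cent of responses on which at least `required` judges chose the same category."""
    agreed = 0
    for judgments in judgments_per_response:
        # Size of the largest group of judges who picked the same category.
        top_count = Counter(judgments).most_common(1)[0][1]
        if top_count >= required:
            agreed += 1
    return 100.0 * agreed / len(judgments_per_response)


# Hypothetical labels from four judges for five responses (not study data).
responses = [
    ("P", "P", "P", "P"),
    ("N", "N", "N", "Av"),
    ("Av", "N", "Av", "A"),
    ("O", "O", "O", "O"),
    ("N", "N", "P", "P"),
]
rate = majority_agreement(responses)
print(rate)
```

Only the first, second and fourth responses meet the criterion in this example, so three of the five responses count as agreed.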

While complete agreement on classification of
responses was neither expected nor obtained, the
reliability studies indicate that the method can
be applied to verbatim responses of clients with
considerable hope of gaining an objective picture
of changes taking place in verbalized self-references.


TABLE 2


PER CENT OF AGREEMENT (3 OF 4 JUDGES) WHEN
356 CLIENT RESPONSES WERE CLASSIFIED
BY FOUR JUDGES


Category    % Agreement
P           62.6
N           82.0
Av          50.5
A           75.0
O           63.7
Q           81.0

Validity of the Method.—The results of two 
long cases were compared with the independent 
analyses of the same cases by their respective 
counselors. In his published account of the case 
of “Herbert Bryan,” Rogers (6) lists at the end 
of each verbatim interview “the outstanding atti- 
tudes which have been spontaneously expressed” 
by the client. He also gives brief subjective de- 
scriptions of what happened during the interview. 
The “outstanding attitudes expressed” apparently 
were selected without reference to such a notion 
as the Self-Concept or self-approving attitudes as 
defined in this study. His list of attitudes was
classified according to the present method and 
the results compared with the results obtained by 
classifying the verbatim material for the eight in- 
terviews of the case. Some differences were 
found, but in general a high degree of correspondence
was noted. Table 3 illustrates the correspondence
found in a typical interview. Chi-square
was applied to the differences obtained by
the two methods for each interview. The probability
(from Fisher’s Table of P) was found to
lie between .80 and .90 indicating that the differences
were primarily due to chance.


TABLE 3


COMPARISON OF RESULTS OF APPLYING THE
METHOD TO ROGERS’ “OUTSTANDING
ATTITUDES” OF INTERVIEW 2 (BRYAN)
WITH THE METHOD APPLIED TO
VERBATIM INTERVIEW


                      Rogers’ List    Verbatim
Total No. of Items    15              43
% P                   20              16.2
% N                   73.5            69.8
% Av                  6.5             14.0
Total %               100.0           100.0

The second attempt to ascertain validity was 
a comparison of results for seven interviews of 
the case of “Alfred” with the results of a different 
method of interview analysis made by the coun- 
selor, Father Charles Curran, and reported com- 
pletely in his book (3). The results of this com- 
parison revealed that in the seven interviews ana- 
lyzed by both methods, where marked changes in 
an interview were revealed by one method, simi- 
lar marked shifts in results were obtained by the 
other method. 

When such attempts to seek validity for a 
method of analysis are made, one can not
expect to emerge with hard and fast conclusions. 
Nonetheless, the results of the comparisons are 
sufficiently encouraging to indicate that a quan- 
titative analysis of client statements contains dis- 
tinct possibilities for unraveling the complexities 
of basic personality dimensions. Curran’s (3) 
intensive analysis of a single counseling case and 
Baldwin’s (2) approach to the same problem 
using personal documents reveal the tedious but 
fruitful possibilities of the quantitative approach 
to verbatim verbal output. 

Application of the Method to 14 Cases.—In or- 
der to explore the possibilities of the method, all 
responses in all interviews of 14 selected and com- 
pleted counseling cases were classified by the 
writer. The clients were primarily referrals to a 
psychological clinic conducted for college students. 
The number of interviews ranged between 2 and 
21 per case, the median being 7 interviews. The 
criteria of selection were as follows: 

1. All should involve primarily the counseling 
of students with personal problems rather than 
cases originating in vocational or educational prob- 
lems. 

2. Different counselors should be represented. 
(There were 11 different counselors.) 

3. Some should represent nondirective, others 
should represent directive counseling. (This cri- 
terion was met as only three were nondirective, 
one was clearly directive and the remainder repre- 
sented blends of the two methods.) 

4. Unsuccessful as well as successful cases 
should be included. (See below.) 

5. Verbatim recordings should be available for 
all interviews of all cases. (This criterion was 
approached closely although 24 of the 111 interviews
were recorded by counselor notes, some of
which were condensations with an unknown num- 
ber of client words omitted.) 

6. Each case should contain all interviews held 
between first and last contact. (Only one inter- 
view in a case of 8 contacts was unobtainable. 
One half of another interview was lacking be- 
cause of recording difficulties.) 

The judgment of “success of the counseling” 
was made prior to the development of the method 
of analysis and was based upon clinical appraisal 
by the counselor involved, the professor of clini- 
cal psychology under whose direction most of the 
cases had been counseled, and the writer. Some 
follow-up data were available for four of the group 
judged successful and for three of the group 
judged unsuccessful. The criteria of success in- 
cluded reports of actually changed behavior and 
social relationships where follow-up was available 
as well as client expressions of satisfaction with 
the counseling at the conclusion of contact. In- 
cluded in the successful group were several clients 
who referred friends or relatives to the same coun- 
selor although this was not used as a criterion of 
successful counseling. For the unsuccessful group, 
judgment was based upon follow-up data showing 
presence of the same problem, failure to return 
for another interview and obvious dissatisfaction 
with the progress of counseling during the final 
interview. Two cases represented questionable 
successes leaving five unsuccessful cases and seven 
judged successful. 

Results of the Method Applied to 14 Cases.— 
In view of the small number of cases available 
for analysis, conclusions resulting from the appli- 
cation of the method must be regarded as tenta- 
tive. Elaborate statistics were avoided since there 
were only five unsuccessful cases and seven clearly 
successful cases. Nonetheless, all results obtained 
seemed to fit the original hypothesis that in suc- 
cessful counseling cases there is a shift in self- 
evaluation from an original preponderance of dis- 
approval to a preponderance of self-approval at 
the end of counseling. In unsuccessful cases such 
a shift was not found. 

One of the most striking indications of this find- 
ing is obtained by examining the graphs con- 
structed for each case based upon the interview 
by interview plotting of the relative frequency 
of the responses classified in the P, N, and Av 
categories. Percentages rather than raw scores
were used in order to eliminate fluctuations due
to the total number of responses varying from
interview to interview. Only two illustrative 
graphs are presented here (Fig. 1). 
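
The conversion from raw scores to percentages can be sketched as follows. The counts are hypothetical, the function name is illustrative, and the denominator is assumed to be all client responses in the interview, as the figure caption suggests.

```python
def category_percentages(counts, plotted=("P", "N", "Av")):
    """Per cent of all client responses in one interview falling in each plotted category."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("interview contains no classified responses")
    return {name: 100.0 * counts.get(name, 0) / total for name in plotted}


# Hypothetical category counts for three interviews of one case (not study data).
case = [
    {"P": 6, "N": 10, "Av": 2, "A": 1, "O": 1, "Q": 0},
    {"P": 3, "N": 12, "Av": 4, "A": 0, "O": 1, "Q": 0},
    {"P": 15, "N": 2, "Av": 0, "A": 1, "O": 2, "Q": 0},
]
series = [category_percentages(counts) for counts in case]
for number, percentages in enumerate(series, start=1):
    print(number, percentages)
```

Plotting `series` interview by interview would give curves of the kind the text describes for each case.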

[Fig. 1. ... counseling, based on P, N, and Av percentages of all
client responses.]

The principal characteristics of successful and
unsuccessful cases are exemplified in Fig. 1. The
progression of Per cent P shows a decline from
the first to the second interview in both successful
and unsuccessful cases, while Per cent N tends
to rise in both groups. This inverse relationship
can probably be explained in terms of the dropping
away of some of the client’s initial defensiveness.
As contacts continue, there are wide fluctuations 
in both Per cent P and Per cent N with P tending 
to rise and N to decline. Per cent of ambivalence 
rises for both groups (probably indicating in- 
creased uncertainty) but disappears, in the suc- 
cessful cases only, near the end of counseling. 
Concluding contacts show P rising to almost 100 
per cent in the final interview in the success group, 
while N declines to the vanishing point. In the 
unsuccessful cases P never exceeds in any inter- 
view a combined N and Av. In accordance with 
expectations aroused by inspection of the graphs 
for all cases, a comparison of the gross frequency 
of each of the six categories revealed that the P 
category was the only one which distinguished be- 
tween successful and unsuccessful cases. 

Gross frequency, however, was computed on 
the basis of all interviews combined which prohibits
analysis of the changes occurring from beginning
to end of counseling. In order to examine
temporal changes, straight line curves were
constructed from the raw scores of the P, N and 
Av categories using the Method of Least Squares. 
It was not supposed that Self-Concept reorgani- 
zation could be accurately represented by means 
of a straight line derived for the purpose of mini- 
mizing inter-interview fluctuations. Nonetheless, 
an inspection of these graphs does show that such 
straight lines do differentiate between the suc- 
cessful and unsuccessful cases. In all but one of 
the seven successful cases the line of P rises to 
intersect and emerge at the final contact above 
the lines of N and Av. The one case in which P 
did not cross above N had 21 interviews in which 
the personal problems seemed to be solved at the 
11th interview and the counseling from then on 
resembled a student-faculty member relationship. 
In none of the 5 unsuccessful cases did the straight 
line of P cross above or even rise to meet the N 
and Av lines. Despite the apparent differences 
between the two groups, no statistical reliability 
for the slopes of any curve in either group was 
found when Fisher’s “t” was applied as a test of 
significance and .05 was made the fiducial limit. 
A trend toward reliability was found in the suc- 
cessful cases which was not approached by the 
unsuccessful group. Thus we are dealing with 
trends and since there are few grounds for believ- 
ing that the change in self-approval can be repre- 
sented by a straight line function the statistical 
measures used may obscure rather than clarify the 
facts. 
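
A minimal sketch of the straight-line fitting follows, assuming that interviews are numbered 1, 2, 3 and so on along the horizontal axis and that the raw category scores are the values fitted. The scores and function names are hypothetical, and the test of the slopes by Fisher's "t" is not reproduced.

```python
def least_squares_line(scores):
    """Slope and intercept of the least-squares line through (interview number, score)."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two interviews are needed to fit a line")
    xs = range(1, n + 1)
    mean_x = sum(xs) / n
    mean_y = sum(scores) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fitted_value(line, interview_number):
    """Height of a fitted line at a given interview."""
    slope, intercept = line
    return slope * interview_number + intercept


def p_line_ends_on_top(p_scores, n_scores, av_scores):
    """True if the fitted P line lies above the N and Av lines at the final contact."""
    last = len(p_scores)
    p_end = fitted_value(least_squares_line(p_scores), last)
    n_end = fitted_value(least_squares_line(n_scores), last)
    av_end = fitted_value(least_squares_line(av_scores), last)
    return p_end > n_end and p_end > av_end


# Hypothetical raw scores for two five-interview cases (not study data).
rising_case = p_line_ends_on_top([4, 3, 6, 9, 12], [12, 14, 9, 5, 2], [2, 4, 5, 3, 0])
flat_case = p_line_ends_on_top([3, 2, 3, 2, 3], [10, 12, 11, 13, 12], [4, 5, 6, 5, 6])
print(rising_case, flat_case)
```

The comparison at the final contact mirrors the inspection described above, in which the line of P either rises above the lines of N and Av or fails to reach them.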

A further study was made of longitudinal 
changes in each case by cumulating raw scores of 
P and dividing by cumulated N at the end of each 
interview. Inspection of these quotients after 
graphing them revealed that the .50 level was 
reached by all of the successful cases by the end 
of counseling. (One case in this group had a final 
ratio of .49.) Of the unsuccessful groups, one 
case showed an initial ratio starting above the .50 
level, never dropping to that level in four inter- 
views. This client, interestingly enough, was the 
only one in the group of fourteen who expressed 
any resentment against initiating counseling and 
was the only one who would be considered a non- 
voluntary referral. Of the other four unsuccess- 
ful cases, only one had a ratio as high as .33 at any 
time. The finding that in successful cases cumulative
P divided by cumulative N reaches the .50
level by the end of counseling (which is not approached
by any of the unsuccessful cases) seems
to be artifactual insofar as its usefulness for prediction
is concerned. There is no plausible reason
to believe that before a case is successfully
concluded the client must express at least half as 
many positive self-references as negative self-refer- 
ences. Nonetheless, the technique of the cumu- 
lative ratio opens interesting possibilities for fur- 
ther analysis of data obtained by this method. 
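
The cumulative ratio can be illustrated briefly. The sketch assumes one raw P score and one raw N score per interview; the scores and the function name are hypothetical.

```python
from itertools import accumulate


def cumulative_pn_ratio(p_counts, n_counts):
    """Cumulated P divided by cumulated N at the end of each interview."""
    ratios = []
    for cum_p, cum_n in zip(accumulate(p_counts), accumulate(n_counts)):
        # The ratio is undefined until at least one N response has occurred.
        ratios.append(cum_p / cum_n if cum_n else None)
    return ratios


# Hypothetical raw scores for a four-interview case (not study data).
p_scores = [2, 1, 4, 8]
n_scores = [10, 12, 6, 2]
ratios = cumulative_pn_ratio(p_scores, n_scores)
print(ratios)
```

In this example the quotient dips after the second interview and reaches .50 only at the final contact, the kind of course described above for the successful cases.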

Another index, called the “Av ratio,” was com- 
puted by finding the percentage which the Am- 
bivalent category forms of the total of the three 
significant categories, P, N and Av. Analysis of 
this ratio for each case indicates that it may dif- 
ferentiate between clients who are seriously con- 
cerned about themselves and enter counseling un- 
der considerable tension from clients manifesting 
less personal disturbance. The hypothesis was 
generally upheld when the cases were ranked sub- 
jectively but there were inconsistencies in the mid- 
dle range. 
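
Stated as a computation, the Av ratio is the ambivalent count taken as a percentage of the three significant categories together. The function name and the counts in the sketch below are hypothetical.

```python
def av_ratio(p, n, av):
    """Ambivalent responses as a percentage of all P, N and Av responses."""
    total = p + n + av
    if total == 0:
        raise ValueError("no responses in the three significant categories")
    return 100.0 * av / total


# Hypothetical case totals (not study data): 12 P, 30 N and 8 Av responses.
example = av_ratio(12, 30, 8)
print(example)
```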


DISCUSSION 


The principal findings can be summarized very 
simply. At the beginning of counseling the clients 
disapproved of and had ambivalent attitudes to- 
ward themselves. As counseling progressed fluc- 
tuations in approval occurred with mounting am- 
bivalence. At the conclusion of counseling the 
successful cases showed a vast predominance of 
self-approval: the unsuccessful cases showed a 
predominance of self-disapproval and ambiva- 
lence. There is nothing startling about such con- 
clusions and it is doubtful if anyone experienced 
in counseling would seriously deny that consider- 
ably increased “self-respect” is observed in the 
client who has been successfully “treated.” The 
significance of the finding that changes in “self-respect”
can be objectively followed during counseling
becomes a crucial matter. A devotee of
“dynamic tensions” might be justified in regarding 
the present findings as superfluous if not amusing. 
Interpretation thus becomes a matter of relating 
the findings to one’s frame of reference where 
personality organization is concerned. The cru- 
cial question becomes, When dealing with self- 
evaluations as defined in this study are we dealing 
with epiphenomena or with indicators of more fun- 
damental changes? 

Psychological events are rarely considered to 
be epiphenomenal. Even the strict behaviorist of 
twenty years ago rejected the data of immediate 
experience because of its unreliability rather than 
because of its nonexistence. The issue is one of
the ultimate significance which can legitimately be 
inferred from subjective report for the understand- 
ing of the behavior in question. The findings of 
the present study can be regarded as contributing 
only fragmentary or biased insight into person- 
ality changes but the findings can not be dismissed 
as “mere superficial descriptions of feeling.” It 
would indeed be a sorry system of psychodynamics 
which dismissed “feelings” (which are probably 
self-judgments, for the most part) from the realm 
of psychological significance. 

The conclusion which can be drawn from the 
present findings, and the interpretation which 
would be in line with self-concept theory, is es- 
sentially the hypothesis that changes in self-ap- 
proval are indicative of changes taking place in 
personality although the observed changes may 
not be direct measures of the fundamental changes 
themselves. Violent fluctuations in self-approval 
such as those occurring at the onset of a frank 
psychotic episode have not been studied quanti- 
tatively and need not contradict the present hy- 
pothesis in view of their bizarre setting. In some 
instances they probably corroborate the general 
hypothesis. 

It is believed that intimate knowledge of self-perceptions
or their organized totality, the Self-Concept,
may well be one of the major pathways
to detecting changes in personality if objective
methods of study can be developed. The self- 
perceptions may not reveal the fundamental na- 
ture of the changes taking place nor the exact 
means by which the changes occur. An analogy 
might be drawn to the use of the thermometer for 
observing changes in temperature. The change 
in elevation of the column of mercury gives no 
direct knowledge to the observer of the altered 
velocities of the molecules surrounding the ther- 
mometer nor of the conditions which produce the 
altered velocities. Yet the thermometer can be 
used as a stable probe body for studying the con- 
ditions under which changes in molecular velocity 
can be brought about. 

The fact that the present method of following 
changes in self-approval is relatively reliable and 
objective lends further hope for its usefulness as
a probe body. Much work remains to be done be- 
fore its ultimate usefulness can be determined. 
Yet it is suspected by the writer and to some de-
gree confirmed by the present study that coun-
selors or psychotherapists may well have been
using a similar technique on an intuitive level in
diagnosing and judging the progress of their cli- 
ents and patients. 

A final question. Can not the present research 
be viewed simply as indicating changes occurring 
in attitudes towards oneself without dragging in 
the Self-Concept? Perhaps. The answer depends 
upon how one views attitudes. As atomistically 
conceived, attitudes may be tabulated as entities 
in themselves. Present theory seems to demand, 
however, a more complex and higher order con- 
struct to account for the functional interrelation- 
ships of attitudes. This demand accounts, in part, 
for the inclusion of the Self-Concept as an ex- 
planatory principle in personality organization. 


SUMMARY 


In the analysis of personality changes occurring 
in counseling and psychotherapy it is necessary to 
find constructs which can be investigated with the 
usual safeguards of objectivity and reliability. 
The events occurring in the interview provide a 
logical focus for such research. The verbal as- 
pects of the interview are mainly concerned with 
the client’s discussion of himself. It is postu- 
lated that a person’s Self-Concept is a significant 
factor in his behavior and personality organiza- 
tion. By measuring changes which occur in cli- 
ents’ attitudes toward themselves, it is believed 
that changes in Self-Concept and therefore in per- 
sonality organization can be detected. 

The present study used a six category check 
list to classify all client statements in terms of self- 
reference with three significant categories relating 
to approval, disapproval and ambivalence. Ob- 
jectivity and reliability of the categories were de- 
termined statistically. Validity was investigated 
by comparing results of the method with inde- 
pendent analyses of two counseling cases by two 
different methods of analysis. 

When the present method was applied to four- 
teen counseling cases for which there were al- 
most verbatim recordings of the 111 interviews, 
consistent differences were discovered between 
cases judged to have been counseled successfully 
and those resulting unsuccessfully. In the suc- 
cessful cases there was a marked shift from a 
preponderance of self-disapproval and ambiva- 
lence at the beginning of counseling to a strong 
emphasis on self-approval at the conclusion of 
contact. This shift in self-evaluation was not 
found in unsuccessfully counseled clients. The
results are interpreted as being in accord with the
hypothesis that successful counseling involves es-
sentially a change in the client’s Self-Concept.


PROBLEMS OF CONTROLS IN PSYCHO-
THERAPY AS EXEMPLIFIED BY THE
PSYCHOTHERAPY RESEARCH PROJECT 
OF THE PHIPPS PSYCHIATRIC CLINIC * 

JEROME D. FRANK 


In the broadest sense, the purpose of controls 
is to answer the question: how sure are you that 
you really know what you think you know? 
Problems of control arise only after a researcher 
thinks he knows something—that is, after he has 
an hypothesis that certain variables are related in 
a certain way—and he wishes to determine 
whether he is right. The purpose of controls, in 
other words, is to exclude alternative hypotheses. 


* Reprinted with permission from the American 
Psychological Association, Inc. In Eli A. Rubinstein 
and Morris B. Parloff (Eds.), Research in Psycho-
therapy, 1959, 10-26.

The level of certainty at which the truth or falsity
of an hypothesis can be established is a function 
of the accuracy with which the relevant variables 
can be identified, measured and manipulated. 
Therefore the degree of possible and desirable 
control in a particular field of study depends on 
its state of development. In its pre-scientific stage 
important insights may be achieved without the 
use of any controls worthy of the name. Even 
at this level, however, since the researcher only 
explores regions where he expects to find some- 
thing, he is being guided by implicit hypotheses, 
and the use of crude controls may facilitate his 
search. Darwin, for example, used a kind of 
control when he made a special point of jotting 
down observed phenomena which seemed to re- 
fute his tentative hypotheses. 

Research in psychotherapy attempts to set up 
and test hypotheses subsumed under the general 
question: what kinds of therapist activity produce 
what kinds of change in what kinds of patient. 
That is, the independent variables lie in the pa- 
tient’s state before the therapist’s intervention and 
in the therapist's activity, the dependent variables 
in changes in the patient’s feelings and behavior. 
Since few of these variables are as yet adequately 
defined and the researcher can directly observe 
or manipulate only a few of those which are im- 
portant, it is obvious that the field of psycho- 
therapeutic research is still at a relatively primi- 
tive level. It is, however, possible to design stud- 
ies which are controlled at least to some extent in 
that they permit planned, though crude, manipu- 
lations of certain variables and randomization of 
others. 

For purposes of discussion of controls in psy- 
chotherapy, the division suggested by Edwards 
and Cronbach (6) into person variables, situation 
variables, and response variables seems especially 
convenient. The personal attributes of the thera- 
pist are situational variables from the patient’s 
standpoint, and are therefore considered under 
this heading.

In order to give focus to the discussion, I shall 
draw chiefly from a study of psychotherapy with 
psychiatric out-patients carried out at the Johns 
Hopkins Hospital, in which we have barked our 
shins against most of the major problems of con- 
trol. The project took its origin from the lack 


1 These studies were supported by the Veterans 
Administration and by a research grant from the 
National Institute of Mental Health, National Insti- 



of demonstrable difference in improvement rates 
reported by proponents of different therapies (2), 
suggesting the likelihood that all forms of therapy, 
including some which are not called psychother- 
apy, have much in common. The task we set 
ourselves was to try to identify attributes of pa- 
tients determining their responsiveness to these 
common features of psychotherapy. This re- 
quired the use of more than one therapist and 
type of therapy and a design which would make 
it possible to determine if any attributes of pa- 
tients were related to improvement regardless of 
therapist or therapy. In addition, the design per- 
mitted analysis of the data to discover possible 
specific contributions of different therapists and 
therapies to the obtained results.2

The patients were selected for the project on 
their initial visits to the out-patient department of 
the Johns Hopkins Hospital. They were white, 
aged 18-55, and of both sexes. Only those with 
organic brain disease, antisocial character disor- 
ders, alcoholism, overt psychosis, or mental defi- 
ciency were excluded. They were further char- 
acterized initially by the usual clinical diagnostic 
categories, by an inventory covering various as- 
pects of their attitudes and behavior deemed rele- 
vant to therapy, and by initial scores on the scales 
used to measure change, described below. 

With respect to the situational variables, the 
therapists were three members of the psychiatric 
resident staff in the second year of training. Each 
had done considerable individual therapy and had 
conducted one therapeutic group under supervi- 
sion. Three forms of therapy were used: group, 
individual, and “minimal.” Group and individual 
therapy were guided by the therapeutic philosophy 
of the Phipps Clinic. In general, the therapist's 
aim is to establish a relationship with the patient 
which will help him identify and correct current
distortions in his interpersonal perceptions and
behavior. In individual therapy this implies rela-
tively greater emphasis on the present than the
past; in group therapy, emphasis on group inter- 


tutes of Health, United States Public Health Service. 
The research staff consisted of Lester H. Gliedman, 
M.D., Stanley D. Imber, Ph.D., Earl H. Nash, M.S., 
and Anthony R. Stone, M.S., S.W. in addition to 
the writer. We wish to express our grateful acknowl- 
edgment to Dr. Morris B. Parloff for his crucial con- 
tributions to the planning and early phases of the 
project. 

2 For a more detailed account of the design, see
Frank, Gliedman, Imber, Nash, and Stone (8).

actions rather than on events transpiring outside.
Examination of historical origins of these dis- 
tortions is seen as a means of clarifying them, 
when appropriate, not as an end in itself. Pa- 
tients received group therapy one and one-half 
hours once a week, individual therapy one hour 
a week. Minimal therapy consisted of a brief
infrequent interview, not more than one-half hour 
every two weeks, focused on the patient’s com- 
plaints and how he might best deal with them. 
The reasons for the choice of these three forms of 
treatment will be considered below. 

The response variables were changes in the pa- 
tient’s subjective discomfort and his social inef- 
fectiveness. These were chosen as representing 
the least common denominator of the aims of all 
the healing arts, including psychotherapy. How- 
ever else a patient may change under treatment, 
unless he becomes more comfortable and more 
effective, it is hard to maintain that he has im- 
proved. Discomfort was defined in terms of forty- 
one symptoms or feelings which the patient re- 
ported as distressing. Ineffectiveness was defined 
in terms of fifteen types of behavior, generally 
recognized as socially ineffective, rated by inter- 
viewers on the basis of information obtained from 
the patient and an informant. We attempted to 
measure decrease in discomfort and ineffectiveness 
rather than increase in comfort and effectiveness, 
because it proved much easier to define degrees 
of malfunctioning than of successful functioning. 
It is easier to define illness than health. 

As to the research design, each psychiatrist 
conducted all three forms of treatment, and pa- 
tients were assigned at random to each by the 
research staff. Each psychiatrist treated 18 pa- 
tients, six in each of the three forms of treat- 
ment. Psychiatrists and patients were urged to 
remain in contact for at least six months, unless 
the psychiatrist felt that the patient had received 
maximum benefit before that time. 
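
The allocation just described can be sketched in code. What follows is a minimal illustration under stated assumptions, not the project's actual procedure: it supposes that all 54 patients are on hand at once (in the project they were accumulated over about 18 months), and the therapist labels A, B, C, the function name assign_patients, and the seed are hypothetical.

```python
import random
from collections import Counter

THERAPISTS = ["A", "B", "C"]                       # hypothetical labels
TREATMENTS = ["group", "individual", "minimal"]    # the three forms of therapy
PATIENTS_PER_CELL = 6                              # six patients per therapist per form


def assign_patients(patient_ids, seed=None):
    """Randomly assign patients to therapist-by-treatment cells of equal size."""
    cells = [(th, tr) for th in THERAPISTS for tr in TREATMENTS]
    slots = cells * PATIENTS_PER_CELL              # 9 cells x 6 slots = 54 slots
    if len(patient_ids) != len(slots):
        raise ValueError("need exactly %d patients" % len(slots))
    rng = random.Random(seed)
    rng.shuffle(slots)                             # random order of the slots
    return dict(zip(patient_ids, slots))


assignment = assign_patients(list(range(1, 55)), seed=0)
per_cell = Counter(assignment.values())                    # patients in each cell
per_therapist = Counter(th for th, _ in assignment.values())  # patients per therapist
```

Shuffling a fixed list of slots, rather than drawing a cell independently for each patient, keeps every cell at exactly six patients and every therapist at eighteen while leaving the choice of which patient lands in which cell to chance.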

Each patient, whether he stayed in treatment 
or not, was re-evaluated six months after enter- 
ing therapy (or when therapy terminated if this 
occurred between one and six months), again six 
months later, and at yearly intervals thereafter. 
In most cases a relative or close friend of each 
patient was interviewed separately at approxi- 
mately the same time as the patient. 

One of the major problems in attempting to do 
controlled studies in outpatient therapy is the dif-
ficulty in carrying through an experimental design.
Because of the mobility of the American popula-
tion, and the fact that psychotherapy competes
with so many other activities in patients’ lives, 
attrition, missed appointments and the like create 
severe problems for maintaining any design which 
extends over time. Our design called for 54 pa- 
tients to receive six months of treatment by three 
therapists. By starting with 91 patients and con- 
ducting the treatments over about 18 months, we 
finally succeeded in obtaining 54 patients, of 
whom 37 (68%) completed at least six months 
of treatment and 48 (90%) had four months or 
more of treatment. By expending considerable 
effort we obtained follow-up interviews on 53 of 
the 54 treated patients at one year and 48 at two 
years. Since then attrition has been marked. 
With this example in mind, we may turn to 
consideration of some problems of control of 
patient, situational, and response variables. 


CONTROL OF PATIENT VARIABLES 


A problem which plagues all research with 
psychiatric patients is the adequate definition of 
the sample to be studied (25). The criteria used
ideally should be communicable with sufficient 
clarity and precision so that other workers by 
using them can duplicate the sample. At the 
same time they must be relevant to psychotherapy. 
None of the customary criteria are adequate in 
these respects. 

It is relatively easy to specify what may be 
termed actuarial characteristics of a research pop- 
ulation, such as age, sex, race and so on, so that 
others can duplicate the sample. The relevance 
of many of these characteristics, however, is ques- 
tionable. If to play safe, a large number of cri- 
teria are included, even though many are sus- 
pected of being irrelevant, it may be difficult to 
accumulate a sufficiently large research sample. 
On the other hand, if one bases selection of the 
sample on only a few criteria, one runs the danger 
of failing to include some that are relevant to ther- 
apy. For example, only recently has the impor- 
tance of specifying social class in studies of psy- 
chotherapy been appreciated (20, 37). 

Since characterization of a population in actu- 
arial terms, however complete, seems insufficient, 
other modes of description must be considered. 
Of these perhaps the most obvious is clinical 
diagnosis. Unfortunately, clinical diagnoses are 
based on rather vague and overlapping criteria, so 
that a patient’s proper diagnostic label is often in
doubt. In one study three well-trained psychia-
trists observing patients jointly showed agree-
ment as to the patient’s major diagnostic cate-
gory in only 46% of the cases, and in only 20%
with respect to the subcategory (3). An addi- 
tional difficulty with descriptive diagnostic cate- 
gories is that they are only loosely related to 
the major concerns of psychotherapy, the patient’s 
underlying conflicts and his characteristic inter- 
personal behavior (10). They may even prove to 
be completely irrelevant for purposes of psycho- 
therapy, although I do not share this view. 
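
An agreement figure of the kind cited above can be made concrete with a small calculation. The sketch below assumes that "agreement" means all three raters assigning the same label to a case, which the text does not specify; the function name and the five cases are invented for illustration and are not data from the study cited.

```python
def full_agreement_rate(ratings):
    """Fraction of cases on which every rater gave the same label."""
    if not ratings:
        raise ValueError("no cases to score")
    agreed = sum(1 for labels in ratings if len(set(labels)) == 1)
    return agreed / len(ratings)


# Hypothetical labels from three raters for five patients.
cases = [
    ("neurosis", "neurosis", "neurosis"),
    ("neurosis", "psychosis", "neurosis"),
    ("psychosis", "psychosis", "psychosis"),
    ("neurosis", "character disorder", "psychosis"),
    ("character disorder", "neurosis", "character disorder"),
]
rate = full_agreement_rate(cases)   # 2 of 5 cases are unanimous, so 0.4
```

A pairwise or chance-corrected index would give different numbers for the same ratings, so the definition of agreement should be reported along with the figure.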

The limitations of the conventional diagnostic 
scheme suggest directions in which to look for 
more useful diagnostic criteria. One would be 
in terms of the patients’ motivations and con- 
flicts. Unfortunately, characterization of dy- 
namics must be based on inferences, and these 
differ depending on the researchers’ theoretical 
preconceptions, so such criteria lose in commu- 
nicability what they gain in relevance. 

Another possibility is to borrow a notion from 
drug studies and select patients in terms of the 
“target symptoms” which the particular form of 
psychotherapy hopes to modify (12), regardless 
of the clinical syndromes in which they occur. 
For example, one could study patients suffering 
from depression, anxiety, or visual hallucinations. 
Since these symptoms obviously can be expres- 
sions of various underlying states, however, the 
idea does not seem very promising in this form. 

If the major target symptoms of psychotherapy
are considered to be disturbances in the patients’ 
characteristic ways of perceiving others and be- 
having towards them, however, then this approach 
may prove to be very fruitful. A sophisticated 
and promising example is the scheme of inter- 
personal dimensions of personality devised by 
Leary and his coworkers (27). 

Having selected the research population and 
described it as best we can, the next problem is 
how to divide it into experimental and control 
samples. There are four theoretically possible 
ways of doing this. Two exist only in fantasy, 
but should be mentioned for the sake of complete- 
ness. The perfect equivalent control group would 
consist of patients matched individually in all re- 
spects with those receiving psychotherapy. The 
matched patients would have identical heredity 
and life experiences up to the beginning of the 
experiment, an obvious impossibility. The second 
impossible method would be to match control and
experimental populations, patient by patient, on
certain variables believed to be significant, such 
as age, sex, educational level and so on. Since 
each additional matching variable greatly increases 
the size of the population which must be screened, 
this approach is hopelessly impractical in outpa-
tient studies.

The third possibility is to match by stratified 
sampling. Groups can be matched with respect 
to the proportions of patients in certain categories 
without matching individuals. Even this proce- 
dure is ordinarily too cumbersome for an out- 
patient department, because one cannot wait for 
a sufficient population of patients to accumulate 
before starting some form of treatment.3 The
final resort is to match by random selection; that 
is, by assigning patients alternately to treatment 
and control groups. The assumption is that over 
the long pull all significant variables will be ran- 
domly distributed among both groups, so that dif- 
ferences found in therapy and control groups at 
the end of the experiment may safely be attributed 
to the therapeutic procedure rather than to an 
unequal distribution of patient variables in the 
two groups. Whether randomization has been 
achieved can be checked by simple statistical 
methods applied to measurable attributes of the 
populations such as actuarial indices and test 
scores.
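
One simple check of this kind is a permutation test on a measurable attribute such as age. The sketch below is offered only as an illustration of the idea; the function names, the ages, and the number of permutations are assumptions, and a t test or chi-square test would serve the same purpose.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_perm=5000, seed=0):
    """Two-sided permutation test for a difference between two group means."""
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                        # relabel patients at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)               # add-one correction


# Hypothetical ages: two groups that look alike, then two that clearly do not.
p_similar = permutation_p_value([23, 31, 45, 38, 29, 52, 34, 41],
                                [25, 33, 44, 36, 30, 50, 35, 41])
p_different = permutation_p_value([20, 22, 24, 25, 27, 28, 30, 31],
                                  [40, 42, 45, 47, 49, 50, 52, 55])
```

A large p value is consistent with the attribute being evenly distributed between the groups; a very small one suggests that the assignment was unbalanced on that attribute. Such a check covers only the attributes that were measured.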

All the methods of control of person variables 
described rest on the assumption that if the con- 
trol and experimental groups are matched on 
known variables, or if these variables are found 
to be randomly distributed throughout both 
groups, then all other variables which might po- 
tentially account for the differences found between 
control and experimental sample would be simi- 
larly distributed. This assumption may not be 


3 Two potential ways of increasing the size of the
population so as to make stratified sampling possible
are by extending the project over time, or by drawing
on the populations of several clinics as in the Veterans
Administration Study (29). Each, however, intro-
duces its own control problems. Extending the study
over time rests on the assumption that time of year
is not a significant variable; i.e. that patients present-
ing themselves for treatment at different times of
year are essentially alike. This assumption may not
be valid, as considered below. If patients are drawn
from many clinics, this increases the number of vari-
ables on which they must be matched, such as ethnic
group, urban or rural, and so on, which tends to
counteract the gain sought by increasing the size of
the sample.

valid, and this has occasionally resulted in scien-
tific tragedy. To take an example from another
field, vast amounts of biochemical work on hos- 
pitalized schizophrenics have come to grief be- 
cause certain biochemical differences between the 
patients and matched nonhospitalized controls 
were attributed to schizophrenia, whereas they 
were really due to institutionalization, with its 
effects on activity, diet and so on. Recently high 
hopes were aroused by the discovery that the 
serum of schizophrenics oxidized adrenaline more 
rapidly than the serum of normal controls. The 
hopes were dashed when this phenomenon proved 
to be caused by vitamin C deficiency in the diets 
of the patients (7). When this variable was con- 
trolled the difference between normals and schizo- 
phrenics disappeared. In this example, an un- 
suspected but crucial difference in situational vari- 
ables invalidated the matching of person variables 
in control and experimental groups, leading to an 
erroneous explanation of the difference obtained.

With respect to control of the patient variable 
in studies of outpatient psychotherapy, it should 
be pointed out finally, that though complete 
matching of control and experimental groups may 
be practically impossible to achieve, certain dif- 
ferences between the experimental and control 
groups need not destroy their usefulness for all 
purposes. In our study, in spite of every effort 
to assign patients randomly to the three types of 
therapy, we found that more lower-class patients 
were placed in group therapy. The sampling bias 
made it difficult to interpret a finding that more 
patients dropped out of group than individual 
treatment, but did not affect other findings, for 
example that in all forms of treatment those who 
scored sickest initially improved most. 

Another example of a biased control group in 
our study is that of patients who drop out of 
treatment within four sessions. These are self- 
selected and presumably differ systematically from 
those who remain in treatment, though the nature 
of the differences is unspecified. Nevertheless, 
the drop-outs served as a useful control of nega- 
tive findings. For example, they showed the same 
drop in discomfort scores after six months as those 
receiving various forms of treatment over this 
period of time (9). This finding permits the con- 
clusion that an improvement in discomfort is not 
a function of duration or type of psychotherapy 
received, or of differences in the nature of drop-
outs and remainers.

In the Rogers and Dymond studies (15), the
equivalent control group was unavoidably biased, 
in that it was selected from volunteers who were 
paid to participate in a “research on personality.” 
They did not perceive themselves as sick or need- 
ing help—an obviously important difference from 
the clients in the therapy group. Nevertheless 
they proved useful for certain types of controls. 

An ingenious way of circumventing the whole
matching problem is to use each patient as his 
own control. The patient is observed before and 
after a time interval in which he receives no 
therapy. He then receives therapy, after which 
the same observations are repeated (15). 
Changes between the first and second readings 
are compared with those between the second and 
third; presumably, differences would be due to 
the effects of therapy. Own-control designs do 
not escape practical and theoretical problems im- 
posed by withholding therapy, which will be con- 
sidered under situational variables. On the whole, 
when they can be managed, they probably repre- 
sent a better control than use of an equivalent 
group, except that they do not control changes in 
patients related to passage of time. For example, 
it has often been noted in a clinic that intake 
falls off during vacation periods (28). Whatever 
the reasons for this, it raises the possibility that 
the condition of patients may be affected by fac- 
tors connected with time of year. If an own- 
control experiment happened to be set up so that 
the no-therapy period was in the spring and the 
therapy occurred in the summer, this would leave 
open the possibility that differences in patient 
change in the no-therapy and therapy periods 
were attributable to the season. 
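
The comparison at the heart of the own-control design can be written out as a short calculation. The sketch below is only an illustration: the function name and the scores are hypothetical, lower scores are assumed to mean less discomfort, and no allowance is made for the seasonal effects just discussed.

```python
def own_control_effects(first, second, third):
    """Change during therapy minus change during the no-therapy wait, per patient."""
    if not (len(first) == len(second) == len(third)):
        raise ValueError("each patient needs three readings")
    effects = []
    for s1, s2, s3 in zip(first, second, third):
        wait_change = s2 - s1          # first to second reading: no therapy
        therapy_change = s3 - s2       # second to third reading: therapy
        effects.append(therapy_change - wait_change)
    return effects


# Hypothetical discomfort scores for four patients.
first_reading = [30, 28, 35, 40]
second_reading = [29, 28, 33, 41]
third_reading = [20, 25, 24, 30]

effects = own_control_effects(first_reading, second_reading, third_reading)
mean_effect = sum(effects) / len(effects)
# effects is [-8, -3, -7, -12]; mean_effect is -7.5
```

A negative value means that the patient's score fell further during therapy than during the waiting period, which is the difference the design attributes to therapy.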

Obviously the proper selection of control groups 
in any study of psychotherapy is difficult. There 
may be severe practical restrictions on the re- 
searcher’s freedom to assign patients to therapy 
and control groups. He must be alert lest un- 
suspected bias creeps into these assignments, and 
must search for important overlooked variables in 
which the control and experimental groups are 
not adequately matched or randomized. Even 
inadequate selection methods are better than none, 
however, as a means of controlling for the patient 
variables in psychotherapy. 


CONTROL OF SITUATIONAL VARIABLES 


From the standpoint of situational controls, the 
first task is to control for the eventuality that
changes in patients observed in the course of a
particular form of psychotherapy are not due to 
intercurrent life experiences or spontaneous fluc- 
tuations in the patient’s state. If this can be 
shown, the question still remains as to whether 
the changes are really attributable to the aspect 
of therapy which the researcher hypothesizes to 
be responsible for them. The chief problem of 
control in this respect appears to be to distinguish 
the effects of the therapist’s personality or attitude
from the effects of his techniques. 

Psychotherapy is only one of many influences 
which may produce changes in patients. For 
example, many psychiatric conditions seem to fluc-
tuate spontaneously or to be self-limited (39),
and a patient is most apt to seek treatment when 
he is in a trough. The subsequent improvement 
may be due to the natural course of his condition. 
Treatment with younger persons may extend over 
a sufficient span so that processes of growth and 
maturation may contribute significantly to the 
changes observed. Improvement due primarily to 
extra-therapeutic occurrences, such as a change in 
job or social relationships, may be erroneously 
attributed to concomitant psychotherapy. The 
task of untangling the roles of therapy and life 
changes is further complicated by the fact that 
psychotherapy may have contributed to the pa- 
tient’s ability to make such changes. 
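
The point about patients seeking treatment when in a trough can be illustrated with a small simulation in which no treatment of any kind is given. This is a sketch under assumed conditions, not a model of any actual clinic: discomfort is taken to be a stable personal level plus an independent fluctuation at each reading, and a person is assumed to present for treatment only when the first reading exceeds a threshold. All names and parameter values are hypothetical.

```python
import random


def simulate_intake_and_followup(n_people=20000, threshold=1.0, seed=0):
    """Return (number who sought treatment, their mean change in discomfort)."""
    rng = random.Random(seed)
    changes = []
    for _ in range(n_people):
        usual = rng.gauss(0.0, 1.0)                  # stable level of discomfort
        at_intake = usual + rng.gauss(0.0, 1.0)      # level plus a fluctuation
        if at_intake <= threshold:
            continue                                 # not distressed enough to come in
        at_followup = usual + rng.gauss(0.0, 1.0)    # a fresh fluctuation, no therapy
        changes.append(at_followup - at_intake)
    return len(changes), sum(changes) / len(changes)


n_patients, mean_change = simulate_intake_and_followup()
```

Under these assumptions the mean change is negative: those selected at an unusually bad moment score lower on average at follow-up although nothing was done for them, which is why an untreated or differently treated comparison group is needed.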

An obvious way of controlling for whether 
changes in patients are due to therapy or some- 
thing else is to compare them with changes in an 
equivalent group of patients who received no 
therapy. The no-therapy group has just been 
considered from the standpoint of control of pa- 
tient variables. We are now concerned with its 
use to control situational variables. With out- 
patients this presents formidable difficulties. The 
major problem is that an adequate no-therapy 
control would have to last the same length of time 
as therapy, to allow the same opportunity for the 
occurrence of significant extra-therapeutic events 
or spontaneous changes in both the control and 
treatment population. Since six months of ther-
apy is usually considered to be the minimal re-
quirement, a control group would have to go
without therapy for six months. Patients present 
themselves to the clinic because they are in distress 
and want something done. It is hard to reconcile 
telling them to wait this long with one’s profes- 
sional conscience. But this is the least of the 
obstacles. Keeping a sizable number of patients
without treatment for so long a period may have
an adverse effect on the clinic’s community rela-
tions. A possible source of a no-therapy control
is patients on a waiting list for treatment, assum-
ing that the clinic is so inefficient as to have a
sufficiently large one. Experience has shown, 
however, that patients placed on waiting lists are 
apt to differ systematically from those who are 
taken promptly into treatment. Patients who 
seem more in need of help or who arouse the 
interest of the interviewer for other reasons tend 
to receive priority for treatment. 

Another difficulty is that most patients who are 
told to go away and come back in six months for 
re-evaluation will not do so. In order to avoid a 
monumental attrition rate, the clinic would need 
to maintain some kind of regular contact with the 
patients over this period; and any contact may 
contain therapeutic components (33), so that the 
no-therapy control would be violated by this pro- 
cedure. In addition, patients who are told that 
treatment at the clinic will not be available for 
some time, if they are in distress will inevitably 
seek treatment elsewhere, whether it be from a 
physician, faith healer, or corner druggist. By 
the same token, patients in psychotherapy will be 
less likely to seek out other sources of help. Thus, 
we would not really have a no-treatment control 
group as against a psychotherapy group but two 
groups, each receiving different kinds of treatment. 
Since the treatment received by the control group 
involves ingredients which may also be psycho- 
therapeutic, interpretation of differences in the 
results obtained in the two groups may be very 
difficult. An additional potential source of error 
lies in the fact that rejection of a patient for 
immediate treatment may affect his attitude in 
such a way as to influence his scores on self- 
administered tests of change, a matter discussed 
more fully below. 

The basic difficulty with the no-treatment con- 
trol is that withholding treatment after interview- 
ing a patient is, in a sense, a positive rejection of 
him. Psychotherapy is one form of interpersonal 
relationship, refusal of psychotherapy is another; 
it cannot be regarded as neutral. 

For many purposes a more promising control 
than withholding treatment entirely is to offer the 
control group a form of psychotherapy differing 
in an essential ingredient from that received by 
the experimental group (42). Since patients in 
both populations, from their own standpoint,
would be receiving therapy, this eliminates the
problem of systematic differences in attitude 
towards the clinic and in the tendency to seek 
outside help. Since both types of treatment would 
be conducted for the same length of time, occur- 
rence of spontaneous fluctuations or important 
intercurrent life experiences would be randomized 
in the two populations, so any obtained differences 
in results could be safely attributed to differences 
in the therapy used. 

In this type of control, a major problem is to 
select and define the therapeutic ingredients in 
which experimental and control groups differ. 
Choice must be guided by two considerations: the 
probability that the variables are therapeutically 
significant and the ability to define them ade- 
quately. However important one may suspect a 
variable to be, its usefulness for research is limited 
by the precision with which it can be described. 
Wittenborn points out that one of the common 
failings of research in psychotherapy is failure to 
define the independent variable. This permits the 
investigator only to say, “how infrequently his 
result could be ascribed to chance but the reader 
is uncertain as to precisely what the result can be 
frequently ascribed” (43, p. 35). 

On the other hand, the researcher must not let 
himself be seduced into selecting aspects of thera- 
peutic technique for study mainly because they 
can be easily described. Too often such aspects 
prove to be therapeutically unimportant, and the 
result is a beautifully designed and reported ex- 
periment which fails to disprove the null hypothe- 
sis. Precision has been gained at the expense of 
significance.4

In the Hopkins project we gained the impres- 
sion from pilot studies that one important differ- 
ence between therapies might be the amount of 
contact between psychiatrist and patient, so we 
incorporated this into our research design (19). 
Group and individual therapy were more nearly 
equated in amount of treatment contact than 
either was with minimal treatment. At the time 
of the first re-evaluation, patients in group and 
individual therapy had had approximately the 
same number of sessions, 15.8 and 17.7 respectively,
as compared to 9.3 sessions for patients
in minimal treatment. Since group therapy and 
individual therapy differed in at least one major 
definable way, namely that in the former several 
patients are present simultaneously, in the latter 


4 See, for example, (38). 




only one, the design also permitted determination 
of possible differences in effects of this variable. 
By our measure of social ineffectiveness, both 
group patients and individual therapy patients im- 
proved significantly more than minimally treated 
patients. This would seem to support the hy- 
pothesis that amount of treatment contact, 
whether in a group or individually, significantly 
affects improvement in social ineffectiveness, al- 
though an alternative hypothesis cannot be entirely 
ruled out, as noted below. 

The most important, and unfortunately the least 
understood, situational variable in psychotherapy 
is the therapist himself. His personality pervades 
any technique he may use, and because of the 
patient’s dependence on him for help, he may 
influence the patient through subtle cues of which 
he may not be aware. Dr. David Rioch tells an 
amusing example of a patient of his who was 
always depressed in the treatment interviews ex- 
cept on five occasions when he seemed quite bright 
and alert. This puzzled Dr. Rioch until he 
reviewed his notes and realized that on these five 
mornings, and on no others, he himself had taken 
benzedrine.⁵

It is obvious that the therapist and therapy 
variables cannot be completely separated. It is 
unlikely that a therapist can conduct different 
types of treatment with precisely equal skill or 
that his attitudes towards them will be identical. 
Therefore, differences in results obtained by two 
forms of therapy conducted by the same therapist 
may be due to therapist rather than treatment 
variables, especially since the faith of a therapist 
in a form of treatment may account for much of 
its efficacy (7). In our psychotherapy study the 
psychiatrists disliked minimal treatment. They 
gave it reluctantly and felt that they were short- 
changing the patients. The patients remained 
just as long in this type of treatment as in the 
other two, suggesting that they were not as lack- 
ing in confidence in it as the doctors. It is pos- 
sible, however, that this difference in the doctors’ 
attitudes may have contributed to the finding that 
patients improved less in social ineffectiveness 
under the minimal treatment conditions. This 
example illustrates how difficult it is to control 
adequately for the influence of the therapist in the 
absence of fuller knowledge about the role of his 
personal attributes and attitudes in determining 
the outcome of treatment. Even though minimal 


5 Personal communication. 
treatment was less effective than group or individual
treatment in the hands of three different
therapists, we cannot be sure that this was not 
due to differences in their attitudes to the different 
approaches, though we believe this to be unlikely. 

Evidence that the personal qualities of the psy- 
chiatrists were not irrelevant to the results of our 
study, however, is that one of the three showed 
a tendency, which did not reach acceptable sta- 
tistical significance, to obtain better results than 
the other two with all three forms of treatment 
by both criteria—discomfort and ineffectiveness. 
Unfortunately, our project was not designed to 
elucidate aspects of therapists’ personalities re- 
lated to their therapeutic success. 

It is clear that the achievement of better ability 
to identify and control therapist variables war- 
rants a high priority in psychotherapeutic re- 
search. Two studies which represent promising 
beginnings in this regard are that of M. B. Parloff, 
who showed that of two therapists of roughly 
similar and equal training, the one who was able 
to establish better social relationships also estab- 
lished better therapeutic relations (37), and the 
series of studies of Whitehorn and Betz, who 
found that psychiatric residents of similar training 
could be placed in two classes on the basis of their 
relative degree of success with schizophrenic pa- 
tients, and that these classes could be distinguished 
by certain patterns of scores on the Strong Inter- 
est Inventory (4). 


CONTROL OF RESPONSE VARIABLES 


Any attempt to consider control of the response 
variable in psychotherapy at once threatens to 
involve one in the tangle of questions as to what 
is meant by improvement in psychotherapy. For 
purposes of the present discussion I shall take the 
position, without attempting to defend it, that the 
aim of psychotherapy, as one of the healing arts, 
is to help the patient feel better and function 
better. The type of functioning which psycho- 
therapy tries to improve is social behavior in its 
broadest sense; that is, the patient’s ability to 
establish mutually satisfying relationships with 
others (32). 

Before turning to considerations of these cri- 
teria, however, it may be well to pause a moment 
on another set of response variables. These are 
changes in the patient’s behavior in the interview 
situation, including, for example, certain auto- 
nomic responses, the content of his verbalizations 
(30, 34), and formal aspects of his verbal behavior
(35). Studies of changes in these variables
as functions of the activities of the interviewer
in a single interview circumvent many of
the problems of control of situational factors discussed
above. They eliminate problems of the
role of intercurrent life experience, or outside 
therapy, or spontaneous long-term fluctuations in 
the patient’s condition. Detailed studies of pa- 
tients’ responses in the interview situation are 
yielding much valuable information. The rele- 
vance of this information to psychotherapy, how- 
ever, depends on the establishment of its relation- 
ship to long-term improvement in the patients’ 
feelings and behavior, which is still far in the 
future. 

Returning now to what I propose to regard as 
the ultimate criteria of improvement, let us first 
consider control problems connected with evalua- 
tion of the patients’ social functioning. Since it 
is ordinarily impossible to observe the patient 
in situ, as it were, estimates of his social effective- 
ness must be inferences based on reports of pa- 
tients and other informants, though impressions 
gained from these can be supplemented, con- 
firmed, or called into question by observations of 
the patient in the treatment situation. Parenthet- 
ically, behavior of patients in therapeutic groups 
may be more useful for this purpose than their 
behavior in a private interview, since the group 
is closer to the interpersonal situations of everyday
life (40).

The major control problem is how to minimize 
biases in the reports and in the observer. The 
former are best controlled by using at least one 
informant besides the patient. Presumably an- 
other informant will not have precisely the same 
attitude towards the patient and psychotherapy 
as the patient does. Comparing and contrasting 
the information from both sources should enable 
the raters to reach a more accurate evaluation 
of the actual state of affairs than relying on either 
alone. 

With respect to rater bias, since the ratings are 
based on interviews, it is not practically possible 
to conceal from the raters the kind of treatment 
the patient has had since the previous rating.
This could conceivably be done by transcribing 
the interviews, having one set of persons edit out 
all clues as to what treatment the patient had 
received, a second set check to make sure this 
has been done, and a third set make the ratings 
of change. In the present primitive state of our
knowledge of psychotherapy, the enormous labor 
required to yield this level of control of rater bias 
can probably more profitably be expended in 
other ways. Moreover, it deprives the interviewer 
of face-to-face contact with the patient, and thus 
prevents his taking advantage of non-verbal cues, 
which may seriously handicap him. In our study 
we tried to guard against rater bias arising from 
knowledge of patients’ treatment in three ways. 
The first was simply to keep this possibility always 
in mind. The second was to base the final global 
ratings of social ineffectiveness on many sub- 
ratings of the patient’s behavior in different situa- 
tions, including the interview. Our social inef- 
fectiveness scale permitted ratings on a maximum 
of fifteen types of ineffective behavior and nine- 
teen categories of social situation, including the 
interview itself. Patients were only rated, of 
course, on the behavior and situations for which 
data were available. As a final safeguard, each 
interview was rated independently by the inter- 
viewer and a concealed observer, and the joint 
rating was arrived at by a conference between 
the four raters, two having rated the patient and 
two the other informant. 

The pros and cons of ratings arrived at by con- 
ference versus those arrived at by arithmetical 
combination of individual ratings are complex 
(18). In general, against the conference ratings 
is urged the danger that one person may unduly 
influence the total result, so that the conference 
would merely confirm the opinions of its most 
powerful member. We checked on this possibility 
in one of our studies by correlating ratings of 
individual conferees made before the conference 
with the conference rating. All correlations were 
within the same range, indicating that no one 
member dominated the group’s judgment (24). 
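The check just described can be sketched in a few lines of Python. This is an illustration only: the ratings are invented, and the names `pearson`, `pre_conference`, and `conference` are hypothetical. The study itself reports only that all correlations fell within the same range.

```python
from math import sqrt

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical ratings of six patients made by each of four raters
# BEFORE the conference (illustrative numbers, not data from the study).
pre_conference = {
    "rater_a": [3, 5, 2, 4, 6, 1],
    "rater_b": [4, 5, 1, 4, 5, 2],
    "rater_c": [3, 6, 2, 3, 6, 1],
    "rater_d": [2, 5, 3, 4, 6, 2],
}
# Hypothetical rating agreed on in conference for the same six patients.
conference = [3, 5, 2, 4, 6, 2]

# Correlate each rater's prior judgment with the conference rating.
correlations = {
    name: pearson(ratings, conference)
    for name, ratings in pre_conference.items()
}
# A small spread suggests that no single rater dominated the conference.
spread = max(correlations.values()) - min(correlations.values())

for name, r in correlations.items():
    print(f"{name}: r = {r:.2f}")
print(f"spread between highest and lowest r: {spread:.2f}")
```

If one rater's correlation stood well above the others, that would be a sign that the conference was merely confirming that rater's opinion.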

In favor of the conference method may be 
offered that it enables each participant to modify 
his impression in the light of information pre- 
sented by the others, so that the rating finally 
reached by the group should be better than that 
of any individual in it. On the other hand, they 
may hear more information than they can digest 
and evaluate, which may impede their ability to 
make a valid collective judgment (23).

On balance we thought that conference ratings 
were probably more valid than those obtained by 
averaging individual ones. We did insist that 
each rater rate the patient before coming to the 
conference to help him withstand the pressures
of the other members. 

Turning to the other major criterion of im- 
provement, change in the patient’s feelings and 
attitudes, the only way of tapping these is through 
his reports, direct or indirect. If the patient can 
clearly perceive the significance of the informa- 
tion he gives, the question arises of controlling 
for factors influencing his statements other than 
his internal state. Indirect measures, which more 
or less conceal from the patient the significance 
of his responses, shift the control problem to the 
validity of their interpretation. 

In using scales which are relatively transparent 
to the patient such as a symptom check list or 
even Q-sorts which yield such measures as self- 
ideal discrepancy, one must always keep in mind 
the possibility that the patient is telling the rater
what he wants him to hear, so that changes in 
scores may be due more to changes in his attitude 
to the observer or to treatment than to genuine 
improvement. In Rogers and Dymond’s study, 
for example, the own-control clients who were 
placed on a sixty-day waiting period of no therapy 
were divided into two groups: the attrition group 
who stayed for less than six sessions of therapy 
subsequently and those who stayed for more than 
six sessions. On all measures the attrition group 
showed more improvement over the wait period 
than those who later accepted therapy. This is 
interpreted to mean that the attrition group 
showed more tendency to spontaneous recovery 
(16). Another possibility exists, which is that at 
the second testing the remainers wanted to show 
that they still felt the need for treatment; the 
attrition group, that they did not want further 
treatment. Thus the remainers would tend to 
indicate that they still felt sick, and the attrition 
group that they did not. Needless to say, such 
distortion need not be deliberate or conscious. 
This interpretation would be consistent, I believe, 
with the Rogers and Dymond findings on the two 
Q-sort measures (self-ideal relationship and ad- 
justment score), which do not fully disguise from 
the patient what the experimenter is looking for. 
I am not sure whether it could also account for 
the similar results on the TAT, but do not believe 
that even this highly indirect approach entirely 
excludes this possibility. 

We found in our study that at the start of each 
evaluation period patients who remained in treatment
until the next re-evaluation had higher discomfort
scores on the average than those who
dropped out before the next evaluation. This is 
similar to the results obtained by Rogers and 
Dymond, and suggests the same thing, namely 
that scores on a discomfort scale are partly the
patient's means of communicating that he wishes 
further treatment. 

Many indirect measures of the patient’s sub- 
jective state have been devised in order to cir- 
cumvent this type of problem. Projective tests 
such as the Rorschach or TAT permit measure- 
ments of changes in the patient’s attitudes through 
the use of communications, the significance of 
which is hidden from him. Since projective tests 
yield permanent records, the points at which they 
were given can easily be concealed from the rater, 
eliminating bias based on knowledge of whether 
or not the patient has had treatment. An in- 
escapable limitation of scores on projective tests 
is that they do not bear an obvious relationship 
to clinical improvement. Therefore, they must 
eventually be validated against other measures of 
patients’ subjective state and behavior. For ex- 
ample, in the Rogers and Dymond study, scoring 
the TAT one way gave results consistent with 
other measures but scoring it in a different way, 
though it yielded another set of relationships to 
the treatment variables, gave results that bore 
no relationship to the other measures of improve- 
ment (5, 17). 

Implicit in the discussion so far is that scores 
on measures of response variables may be influenced
by the conditions under which they are
measured, or by the measuring instrument itself. 
This problem, which exists even in the physical 
sciences, assumes major proportions in the evalua- 
tion of patients’ reports of improvement. As 
already indicated, these may be affected by the 
patient’s attitude towards therapy or the therapist. 
There remain to be considered the possible effects 
of the tester’s expectancies, of the form of the 
test, and of its repetition. 

With all measures except the strictly objective 
ones, and possibly even in these, the tester may 
influence the scores in accord with his expecta- 
tions through a process which may be analogous 
to operant conditioning (14).⁶ The cues which
mediate this influence may be so subtle as to
escape the awareness of both interviewer and subject.
Although the limits of the effects of operant
conditioning are not known, it certainly exerts
more effect than has been generally realized.
Examples are the way Freud’s patients fabricated
infantile memories to conform with his theory
of the etiology of neuroses (11), the recent work
by Salzinger and Pisoni who showed that within
ten minutes it was possible significantly to increase
the number of affective statements made by
schizophrenic patients (34), and Murray’s analyses
of therapy protocols which demonstrated
rapid shifts in frequency of certain content categories
of the patients’ productions in accordance
with the therapists’ values, even when the latter
thought he was non-directive (30).

6 Note a recent and relevant review of the literature:
Krasner, L. Studies of the conditioning of
verbal behavior. Psychol. Bull., 1958, 55, 148-170.

To complicate matters further, the patient may 
influence his own subjective state by hearing his 
report of it. If, in response to factors in the test 
situation, he says he feels better or worse than 
he “actually” does, his feelings may change to 
conform with his behavior, as William James ob- 
served long ago in a somewhat different context 
(22, p. 463). 

The form of the test may influence the patients’ 
scores. In our study the correlation between 
patients’ global estimates of improvement and 
their scores on the discomfort scale was only 0.65. 
Apparently it made a difference whether the meas- 
ure was itemized and written, or global and oral. 
An itemized scale perhaps makes the patient more 
cautious on the one hand and, on the other, re- 
minds him of complaints which had slipped his 
mind. 

The possible effects of mere repetition of the 
test must also be considered, especially those due 
to the patients’ greater unfamiliarity with the test 
and test situation on the first occasion than on 
subsequent ones (27). He may be puzzled by 
the test itself or made uneasy by other factors of 
the situation, which can adversely affect his scores. 
These influences are apt to be less strong the sec- 
ond time he takes the test, yielding “better” 
scores. We found that there was a marked aver- 
age drop in discomfort scores the first time the 
scale was readministered, and that on the average 
this drop was maintained over the following two 
years. The most probable explanation of this 
phenomenon would be that the first scores on the 
Discomfort Scale were artificially heightened by 
the patient's general uneasiness in an unfamiliar 
situation, so that we were measuring not so much 


the effect of six months of treatment as the effect 
of greater familiarity with the test and the situa- 
tion. It was possible to control for this by taking 
a group of patients at the two-year follow-up 
interval, giving them a placebo to take for two 
weeks, and then readministering the Discomfort 
Scale. We found a drop of the same order of 
magnitude between this fourth and fifth admin- 
istration of the scale, following the placebo, as 
there was between the first and the second, follow- 
ing psychotherapy (13). The drop in response 
to the placebo obviously could not be explained 
by increasing familiarity with the implement. 
This finding has implications for the relationship 
of psychotherapy to the placebo effect which are 
irrelevant here. In this context it is cited as an 
example of controlling for the effects of repetition 
of a test. 

Discussion of control of response variables in 
psychotherapy would be incomplete without men- 
tion of the importance of follow-up studies to 
determine long-term effects of treatment. Evalu- 
ation of any form of treatment is obviously inade- 
quate in the absence of information as to the 
duration of its effects. The longer the study con- 
tinues, the greater is the problem posed by attri- 
tion of the sample. This is influenced by patients’ 
attitudes towards the treatment received, and by 
the kind and amount of information sought by 
the investigator. The importance of the patients’ 
attitudes may be illustrated by the fact that after 
two years we were able to obtain re-interviews 
with 90% of the patients who had originally accepted
therapy, but with only 33% of those who had
dropped out of treatment. Presumably the feel- 
ing of the former group towards the Clinic was 
much more favorable. 

The effect of the amount of information desired 
is suggested by the fact that our most strenuous 
efforts brought back only 56% of the treated 
patients for personal re-interviews after three 
years, whereas Saslow (36) reports a return of 
about 80% after four to six years, to written 
requests for limited information. 

Thus the conductor of a follow-up study is 
faced with a variety of choices as to how best to 
expend his resources. He must balance consider- 
ations of completeness of sample against the rela- 
tive value of various types of information ob- 
tained in different ways, and so on, but these 
questions need not concern us here. 

In conclusion I should like briefly to review
certain general considerations about controls,
which seem to come up repeatedly in clinical research.
The first concerns replication of findings.
No findings, however striking, are more than ten- 
tative until they have been replicated. Replica- 
tion, incidentally, need not involve actual repeti- 
tion of the study. If the population is large 
enough, the same result can be achieved by divid- 
ing both experimental and control groups in half, 
and using one set as a test of the reliability of the 
findings obtained with the other. 
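The split-sample form of replication mentioned above might be sketched as follows. This is a minimal illustration under stated assumptions: the improvement scores are invented, the function names `split_half` and `split_half_differences` are hypothetical, and a simple difference of means stands in for whatever measure of effect a given study uses.

```python
import random

def mean(xs):
    """Arithmetic mean of a non-empty list."""
    return sum(xs) / len(xs)

def split_half(scores, rng):
    """Randomly divide a list of scores into two halves."""
    shuffled = scores[:]          # copy, so the caller's list is untouched
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]

def split_half_differences(experimental, control, seed=0):
    """Return the experimental-minus-control mean difference in each half.

    The first half plays the part of the original study and the second
    half the part of the replication.
    """
    rng = random.Random(seed)     # fixed seed keeps the split reproducible
    exp_a, exp_b = split_half(experimental, rng)
    ctl_a, ctl_b = split_half(control, rng)
    return mean(exp_a) - mean(ctl_a), mean(exp_b) - mean(ctl_b)

# Invented improvement scores, for illustration only.
experimental = [8, 9, 8, 10, 9, 8, 11, 9]
control = [5, 6, 4, 6, 5, 7, 5, 6]

first_half, second_half = split_half_differences(experimental, control)
print(f"difference in first half:  {first_half:.2f}")
print(f"difference in second half: {second_half:.2f}")
```

A finding that appears in one half but vanishes in the other should be treated as tentative. The split must be made before the data are examined, or the second half no longer serves as an independent check.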

Replication with a fresh sample tests the ade- 
quacy of the description of the variables in the 
original study and the accuracy of the original 
observations. This is particularly important in a
field like psychotherapy where so much is still 
unknown. Thus it is not surprising, though per- 
haps a little disconcerting, to find that repetitions 
of studies at the same clinics, with presumably 
similar populations and therapists, have failed to 
reproduce certain findings that possessed high sta- 
tistical reliability (29, 41). Failure of others to 
replicate a finding may lead the original re- 
searcher to discover that he had failed to make 
explicit an important experimental condition. At- 
tempted replication by others also helps to estab- 
lish the extent to which the original finding can 
be generalized to populations and settings differ- 
ing in various ways from the original ones. 

Perhaps the most important value of replication 
is that it guards against ex post facto reasoning. 
There is no limit to the ingenuity of the human 
mind. It seems to be literally impossible to pre- 
sent a person with a set of data that are so random 
that he will not be able to read a relationship into 
them.⁷ In psychotherapy if an experiment seems
to demonstrate a certain relationship between 
therapeutic variables and changes in the patient, 
the experimenter can always make an hypothesis 
to explain it. This is a necessary and desirable 
first step to further research. A common error, 
however, is to offer the observed relationship as 
proof of the hypothesis. This circular reasoning 
can be escaped only by making an explicit predic- 
tion on the basis of the hypothesis and then seeing 
if the prediction is borne out with a fresh sample. 

While replication is ordinarily highly desirable, 
this is not to say that every finding should be 


7 Personal communication from Alex Bavelas. 
replicated. Since, especially in research on therapy,
replication may involve many months of
work, and time and energy are limited, the inves- 
tigator must ask himself whether the tentative rela- 
tionship he thinks he has discovered is important 
enough to justify the effort of replication, or 
whether his time would not be better spent look- 
ing for more significant data. If he reaches the 
latter decision, is he justified in publishing the 
unreplicated finding? I believe he is, to make it 
available to others, as long as he does not present 
it as more than suggestive. 

A general point about controls, which is obvi- 
ous to statisticians but seems difficult for some 
clinicians to grasp, is that statistical methods of 
control can be applied to a relatively small sample. 
The size of the sample needed to achieve any 
given level of significance varies directly with the 
variability of the responses and the range of char- 
acteristics of the patients in the experimental and 
control groups, and varies inversely with the mag- 
nitude of the difference between the groups at the 
close of the experiment, assuming that they were 
matched at the beginning.⁸ Also, it is possible
validly to generalize findings obtained with a 
small population to a very large one, as demon- 
strated by public opinion polls. All that is re- 
quired is that the small sample be truly represen- 
tative of the larger one; that is, that important 
variables show the same relative frequency distri- 
butions in the two groups. Of course, the greater 
the discrepancy in size between the sample and 
the total populations, the greater the care needed 
to assure its representativeness. 
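The dependence of sample size on variability and on the size of the difference can be made concrete with a common textbook approximation for comparing two group means. The sketch below is not a reproduction of the Kramer and Greenhouse tables. The function name `n_per_group`, the two-sided 5% level, and the 80% power figure are assumptions chosen for illustration, and the formula assumes roughly normal scores with equal spread in both groups.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(sigma, delta, alpha=0.05, power=0.80):
    """Approximate patients needed in EACH group to detect a true mean
    difference `delta` when scores have standard deviation `sigma`.

    Uses the normal approximation
        n = 2 * ((z_alpha + z_power) * sigma / delta) ** 2
    for a two-sided test at level `alpha` with the stated power.
    """
    if sigma <= 0 or delta <= 0:
        raise ValueError("sigma and delta must be positive")
    z = NormalDist()                       # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)     # critical value for the test
    z_power = z.inv_cdf(power)             # quantile for the desired power
    return ceil(2 * ((z_alpha + z_power) * sigma / delta) ** 2)

# More variable responses require more patients ...
print(n_per_group(sigma=1.0, delta=0.5))
print(n_per_group(sigma=2.0, delta=0.5))
# ... and a larger difference between groups requires fewer.
print(n_per_group(sigma=1.0, delta=1.0))
```

Because the required number grows with the square of the ratio of variability to difference, halving the variability of the measure, for instance by a more homogeneous sample or a more reliable scale, cuts the needed sample to about a quarter.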

Statistical measures of significance may be mis- 
leading in that a statistically significant finding 
need not be significant in the non-technical sense 
of the term. The discovery of a very low corre- 
lation between variables which achieves high sig- 
nificance because the groups involved are large, 
indicates, to be sure, that some relationship is 
present, but it may be so weak as to contribute 
practically nothing to an understanding of the 
phenomena under study. The central question 
posed by such a finding is whether pursuit of the 
lead is likely to unearth a relationship of sufficient 
importance to justify the effort. An analogy 
might be the discovery of a low grade of ore. The 

8 Kramer and Greenhouse (26) have prepared
tables indicating how large experimental and control
groups must be in order that given amounts of difference
between them will achieve given levels of
significance.

decision as to whether to try to extract the metal
from it would depend on the estimate of the work 
involved and the potential value of the metal. 
The Curies used tons of pitchblende to obtain a 
few grains of radium. They would not have 
made a similar effort to extract an equal amount 
of lead. 
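The point about weak but statistically significant correlations can be illustrated numerically. The sketch below is illustrative only: the function name `correlation_test` and the figures are hypothetical, and the p value uses a normal approximation to the t distribution, which is adequate only for large samples.

```python
from math import sqrt
from statistics import NormalDist

def correlation_test(r, n):
    """Return (t, approximate two-sided p, r squared) for a correlation r
    observed in a sample of n cases.

    t = r * sqrt((n - 2) / (1 - r^2)); the p value treats t as a standard
    normal deviate, which is only an approximation for small n.
    """
    if not -1 < r < 1 or n < 3:
        raise ValueError("need -1 < r < 1 and n >= 3")
    t = r * sqrt((n - 2) / (1 - r * r))
    p = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p, r * r

# The same very low correlation in a large and in a small sample
# (invented figures, for illustration only).
for n in (10_000, 30):
    t, p, shared = correlation_test(0.05, n)
    print(f"n = {n}: t = {t:.2f}, p = {p:.4f}, "
          f"variance accounted for = {shared:.2%}")
```

In the large sample the correlation passes any conventional test of significance, yet in both cases it accounts for only a quarter of one per cent of the variance. The significance level reflects the size of the sample, not the strength of the relationship.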

Failure to demonstrate a significant relationship 
between two variables does not prove the absence 
of such a relationship. The statement that a 
proposition has not been proven to be true is not 
the same as the statement that it has been proven 
to be false. Some of my colleagues accuse me of 
being a therapeutic nihilist when I point out that 
it has not yet been demonstrated that different 
forms of therapy lead to significantly different 
results. They hear this as an assertion that no 
such differences exist, instead of an attempt to 
point out an area requiring research. 

Many factors may obscure the existence of a 
genuine relationship. Wittenborn (43) points out 
that if the sample is not normally distributed, this 
increases the standard error of estimate, decreas- 
ing the possibilities that the differences between 
experimental and control groups will meet a sta- 
tistical criterion for significance. This may lead 
to a real difference being overlooked. He sug- 
gests statistical measures for testing and correct- 
ing for this error. Significant differences may also 
be obscured by insensitivities or errors of the 
measuring instruments. In research on psycho- 
therapy, the therapist, through unfamiliarity with 
a new technique, may fail to get results, not because
the technique is valueless, but because he
uses it poorly. “These are the factors our tender-minded,
but not therefore unscientific, investigator
bears in mind. He stresses that ‘not statistically
significant’—like the Scotch verdict ‘not
proven’—permits us to return the hypothesis on 
trial to the arms of those who love it, rather than 
at once chopping off its head” (6, p. 57). 

Though controls enable the investigator to state 
the level of confidence of his finding, they do not 
insure a correct interpretation of it. That the 
serum of hospitalized schizophrenics oxidizes 
adrenaline more rapidly than the serum of non- 
hospitalized controls was established with a high 
level of certainty and replicated, but this did not 
prove that the difference was due to schizophrenia. 
It turned out to be due to differences in the diet
of the two populations. We can assert that the 
chances are better than 95 in 100 that patients 


will show more improvement in social effective- 
ness after six months of weekly individual therapy 
than after six months of minimal treatment, but 
the interpretation of this difference remains open. 
It would take another study to determine if the 
difference is best explained by the fact that patients
in minimal treatment had fewer therapeutic
contacts, or by some other factor, for example 
that minimal treatment was devalued by therapists 
and patients. These examples illustrate that often 
one cannot control for a variable until one thinks 
of it. The automatic use of controls is no sub- 
stitute for thought. 

Since controls are always relative, and since 
energy expended in establishing controls is di- 
verted from that devoted to seeking new insights, 
their proper use is not solely a question of apply- 
ing the correct methods, but involves exercise of 
judgment concerning what level of control should 
be attempted. How much control to strive for is 
determined by the state of development of the 
subject and the potential importance of the finding 
to be checked. Sometimes it may be better to 
accept a poorly controlled tentative finding as a 
source of leads for potentially more penetrating 
or definitive studies, than to divert time and 
energy to trying to increase its level of confidence. 
Efforts to use a level of control not warranted by 
the state of the problem may be as hampering to 
good research as failure to use controls that are 
possible. 

Preoccupation with controls, moreover, is apt
to guide the selection of questions for study, not 
by their significance, but by the ease with which 
they can be investigated. One is reminded of the 
familiar story of the drunkard who lost his keys 
in a dark alley but looked for them under the 
lamp post because the light was better. 

What is most needed in research on psycho- 
therapy is originality of thought and courage to 
grapple with important issues, setting up as much 
control as is feasible. Each experiment should 
lead to another which is an improvement over its
predecessor. In this sense a bad experiment is 
better than none, and several are better than one. 
Unless one makes the original crude experiments, 
no progress is possible. 



GENERAL REFERENCES 


Arieti, S. American handbook of psychiatry. 
New York: Basic Books, 1959. 

Eysenck, H. J. Handbook of abnormal psychol- 
ogy. New York: Basic Books, 1960. 

Hollingshead, A., and Redlich, F. C. Social class 
and mental illness. New York: John Wiley, 
1958. 

Rogers, C. R., and Dymond, Rosalind F. (eds.) 
Psychotherapy and personality change. Chi- 
cago: University of Chicago Press, 1954. 

Rubinstein, E. A., and Parloff, M. B. (eds.) 
Research in psychotherapy. Washington, D.C.: 
American Psychological Association, 1959. 

Sarbin, T. R. (ed.) Studies in behavior pathol- 
ogy. New York: Holt, Rinehart and Winston, 
1961. 

Snyder, W. U. The psychotherapy relationship. 
New York: Macmillan, 1961. 




SECTION IX 


The Case Study 


A correct inference to be drawn from this book 
is that research which has been carefully planned 
throughout and clearly executed is the best way 
to contribute to a science of personality. How- 
ever, occasionally psychologists in their day-to-day 
contact with persons observe, sometimes in poorly 
controlled settings and sometimes quite acciden- 
tally, behavior patterns so unusual and illuminat- 
ing that the necessity and usefulness of sharing 
their observations with interested colleagues be- 
comes obvious. Clinical observations, usually in 
the form of case studies and clinical reports, serve, 
among other functions, (1) to provide leads for 
future well-controlled laboratory and field studies, 
(2) to shed light on relevant theoretical issues, 
and (3) to highlight the complexity of human 
behavior, its variability and its modifiability. 

The case of opposite speech in schizophrenia 
described in the articles by Laffal, Lenkoski and 
Ameen, and Laffal and Ameen, relates to these 
functions. This is also true for the dramatic case 
of multiple personality described by Thigpen and 
Cleckley. Both of these case studies provide the 
reader with relatively large amounts of informa- 
tion about particular very atypical individuals. 
This type of information, frequently in the form 
of raw data, can be of immense value to clinical 
workers as well as to researchers. 

The clinical paper by Fromm-Reichmann is in 
some respects different from the case studies pre- 
sented in the first three articles in this section. 
While referring to particular cases, Fromm-Reich- 
mann’s major concern is to use these cases as 
illustrations of particular generalizations which 
have evolved out of her own clinical experience. 
Since the development of simple cures for mental 
disturbance is, at present, a distant goal, the trans- 
mission of clinical experiences and impressions is 
of great importance to the clinician-in-training. 
For such clinicians-in-training, Fromm-Reich- 
mann’s interpretations of her patients’ behavior 
may be of as much utility as the descriptions of 
their behavior per se. 

This section’s concluding article, while a case 
study report, is a different breed of case study 
than the ones presented earlier. Isaacs, Thomas 
and Goldiamond, while centering their attention 
on the description of a particular form of psycho- 
pathology, have attempted also to integrate ex- 
perimental procedures into their approach to the 
patient. Using techniques originally employed in 
laboratory investigations of the behavior of hu- 
mans and animals, they successfully manipulated 
the behavior of the subjects of their report. Just 
as we have seen how investigators interested in 
more laboratory studies of learning and percep- 
tion have found personality variables of signifi- 
cance in their work, so the study of Isaacs, 
Thomas and Goldiamond suggests how the stu- 
dent of personality and psychopathology may bor- 
row techniques originally developed by the experi- 
mental psychologist for use in his own clinical 
work. There is, it would seem, every reason to 
be optimistic about this sort of cross-fertilization 
of interest. 


“OPPOSITE SPEECH” IN A 
SCHIZOPHRENIC PATIENT * 


JULIUS LAFFAL, L. DOUGLAS LENKOSKI, 
AND LANE AMEEN ¹ 


The distorted speech of the schizophrenic has 
been described by White (12) and Woods (13) 
as the product of a regression characterized by 
reversion to a lower order of abstraction than 
normal adult language. Goldstein (6) also 
pointed out the relative concreteness of schizo- 
phrenic thought and language, and Woods (13) 
further described schizophrenic language as “el- 
liptical.” Freud regarded language distortions in 
liptical.” Freud regarded language distortions in 
schizophrenia as consequences of the same proc- 
ess as in the dream, the primary process, in which 
condensation and displacement function freely 
(4, p. 437). One of the effects of the schizo- 
phrenic’s distorted speech is to make communi- 
cation between him and others extremely difficult, 


* Reprinted by permission from The Journal of 
Abnormal and Social Psychology, May, 1956, Vol. 52, 
No. 3, 409-413. 

1 The authors are indebted to Dr. Norman Cam- 
eron of Yale University for his invaluable suggestions 
in preparing this paper. 
and this difficulty appears to be what Sullivan had 
in mind when he wrote (11, p. 8): “Since lan- 
guage is the most subtle and powerful lever that 
any culture provides, most linguistic operations of 
human beings in general, and all linguistic opera- 
tions of the schizophrenic, have to be oriented 
toward the pursuit of something quite impossible 
of attainment: a feeling of security in the pres- 
ence of strangers. As the schizophrenic, because 
of the very insecurity which has always charac- 
terized him, has tended even more to divorcement 
from these fellow men with whom he has never 
felt secure, the language operations at the height 
of the schizophrenic episode show most perfectly 
the sheerly magical operations which men effect 
by language.” 

In this paper we report an unusual language 
syndrome in a schizophrenic patient. The syn- 
drome, which we have called “opposite speech,” 
consists basically in the use of “yes” by the patient 
when he means “no,” and vice versa. We have 
not been able to delineate fully the extent to 
which this reversal has generalized in his lan- 
guage use, but we have been able to demonstrate 
that, besides the interchange of “yes” and “no,” 
it includes the interchange of “right” and “wrong,” 
the interchange of “do” and “don’t,” and occasion- 
ally the interchange of such opposites as “some- 
thing” and “nothing” and “like” and “hate.” The 
syndrome appears to be both receptive and ex- 
pressive, and it includes written as well as oral 
speech. While the documentation to be presented 
shows the opposite speech primarily in its expres- 
sive aspects, the receptive distortion has been 
noted in the patient’s misinterpretation of simple 
instructions and statements. The following is a 
brief description of the patient in whom the lan- 
guage reversal has occurred. Some insignificant 
details have been altered for the sake of anonym- 
ity. 

Peter is a twenty-four-year-old single man who 
first exhibited psychotic behavior while serving as 
a gunner with a mortar company in combat in 
Korea. At the time of admission to the field 
hospital he was mute, and subsequently he neither 
ate nor talked for three days. Information ac- 
companying him to the field hospital revealed that 
he had been hearing voices telling him he was 
dirty, that he spread disease, that he was an 
S.O.B. and a pervert. He was transferred to the 
United States one month later. At that time he 
ate very little because he felt that his food con- 
tained chopped up people and worms. He 
equated worms with penises. After a total of 45 
electric shock treatments without sustained im- 
provement, he was transferred to a Veterans 
Administration hospital. There he was better 
oriented and in fair contact, and during the next 
few months, he was given several passes and 
leaves of absence and was finally placed on trial 
visit approximately fourteen months after his ini- 
tial breakdown on the battlefield. Three months 
after the beginning of his trial visit, he became 
quite agitated, telling his mother that one of the 
girls in the neighborhood was calling his name and 
would not leave him alone. He began to pound 
on the wall and shout that he wanted the name- 
calling to stop. It was at this point that the pa- 
tient was brought to his present Veterans Adminis- 
tration hospital. There was no mention in the 
fairly detailed Army and previous Veterans Ad- 
ministration hospitalization records of the oppo- 
site speech. 

The patient is an only child. His parents are 
frugal, lower middle class people, who appar- 
ently were always able to provide their son with 
things other boys had. They describe him as 
having been shy and introverted but always a 
good boy. He was an average student, being 
graduated from high school at the age of eight- 
een. Concerned about his slender physique, he 
sometimes worked enthusiastically with weights 
to build himself up. He was too bashful to go 
out with girls, and as far as can be ascertained, 
had no heterosexual relations. Following gradu- 
ation from high school, he worked in an uphol- 
stery shop until drafted into the Army. The fam- 
ily matrix includes an exceedingly protective, in- 
dulgent, and yet controlling mother, and a rela- 
tively ineffectual father. 

In asking the patient routine questions at ad- 
mission, the resident was struck by the confusing 
answers that he received. The patient appeared 
to substitute “yes” for “no,” and “always” for 
“never.” The patient’s parents first noted this 
type of reversal on the day prior to his admis- 
sion when the patient said at lunchtime, “I’m 
hungry.” His mother prepared a sandwich, and 
when she offered it to him he rejected it. In his 
behavior on the ward, the patient was withdrawn, 
uncommunicative, compliant, and well-behaved 
despite some agitated periods when he halluci- 
nated a marriage and the presence of his “wife” 
in the hospital, and when he hallucinated the 
presence of his mother on the ward. The hallu- 
cinatory and delusional content was mainly sexual 
and religious in nature. Neurological studies were 
noncontributory. 

Psychological tests reflected the clinically patent 
thought disorder. Eight responses were given to 
the Rorschach, all of which were whole animal 
responses, and half of which were minus re- 
sponses. The Wechsler IQ was 71, with Verbal 
IQ 89 and Performance IQ 56. The breakdown 
in intellectual functioning was clearly due to the 
intrusion of psychotic material, and in light of 
the patient’s scholastic record as a high school 
graduate, his intellectual potential must be con- 
sidered higher than shown by the test. Word 
association and sentence completion tests demon- 
strated the opposite speech syndrome in some of 
the responses. Thus, to the stimulus limp he re- 
sponded, “Limping around, able to walk.” To 
disgust he responded, “Disgusted, when you're 
feeling good, happy and contented.” To stam- 
mer he responded, “Stutter. When the words 
come out easier and you have trouble.” On the 
Michigan Sentence Completion Test, orally ad- 
ministered, some of the completions were as fol- 
lows: What makes me angry is “if things go right, 
the weather is good, if I’m feeling good, and mad 
at myself.” When people make fun of me “I feel 
good, I get very nervous inside, I start to like.” 
Home “there’s any place like home. Home is the 
best cure.” I feel like cursing when “things go 
good, or if I have any hard luck, get in trouble in 
any way. Well, if things go right... . . If I have 
some plans and things come up to break it... 
don’t come up to break it.” I dream “. . . 
sometimes of things that are . . . things that 
seem real, but they are real, but they don’t seem 
real . . . could be true, yet not fictitious. You 
dream you’re a doctor when you are a doctor, 
when you’re just an ordinary person.” She dis- 
liked him when he “was with her . . . against her 
. . . not against her . . . making her happy, 
comfortable.” I feel like smashing things when 
“I’m calm.” 

In explaining proverbs, the patient’s opposite 
speech was apparent, and it was evident as well 
that the reversals were not consistent but inter- 
mittent and hence even more confusing. Thus 
when asked to explain the saying, When the cat 
is away the mice will play, the patient responded, 
“If the mice are in the presence of the cat they 
won't play. If the cat isn’t present while the 
mice aren’t there, of course the mice can’t play.” 
To Don’t cross your bridges until you come to 
them, the patient responded, “Why worry about 
your bridges. In other words, when you’re in 
front of something, worry about it. Before you 
meet it...” (Patient pauses; examiner encour- 
ages him.) “... If I got to meet somebody 
that’s my enemy ... Well, suppose I didn’t 
know it, and I don’t meet him, what in heaven's 
name am I going to do about it? Just confront 
him, even if I didn’t know I was going to meet 
him.” To Don't look a gift horse in the mouth, 
the patient responded, “I know what the hell a 
gift horse is.” 

Shortly after the patient’s admission the bur- 
den of his treatment as well as of the exploration 
of the speech syndrome fell to JL with LDL re- 
taining the role of co-therapist and assuming ad- 
ministrative and medical responsibility for the pa- 
tient. The emphasis in all contacts with the pa- 
tient has been upon treatment, and relatively little 
time with him has been devoted to exploration of 
the speech syndrome as such. These points are 
mentioned because the secondary reward aspect 
of the speech syndrome may be thought to be of 
importance in its persistence. Throughout the 
more than eight months of the current hospitali- 
zation, the opposite speech pattern has remained 
adamant, although it has not been consistently 
present or consistently used. The symptom also 
persisted under amytal, which, incidentally, had 
the effect of bringing out hallucinatory and de- 
lusional material about which the patient was 
ordinarily taciturn. Below are presented portions 
of two recorded interviews, the first without medi- 
cation and the second under amytal, which fur- 
ther demonstrate the opposite speech. 


Interview Without Medication 


Dr.: When did you come in the hospital? . . . 
When did you come in the hospital? . . . When? 

Pt.: I came in the hospital about five weeks 
ago, and when I’m going to leave I know. 

Dr.: When? You say you know when you're 
going to leave? 

Pt.: I do. 

Dr.: Can you tell me when? 

Pt.: When I’m going to leave I do know. Of 
course it’s not up to the V.A. when I’m leaving. 
It’s not up to the doctor. 

Dr.: Who is it up to? 

Pt.: It’s not up to my doctor. 

Dr.: Who? Who is it up to? You say it’s not 
up to your doctor. Who is it up to? Why are 
you smiling? You said it’s not up to the doctor. 
Did you get your words mixed up? 

Pt.: Yes. 

Dr.: What did you mean to say? You got your 
words mixed up. You said it’s not up to the doc- 
tor when you leave. 

Pt.: It’s not. 


Interview Under Amytal 


Dr. JL.: Peter, I want to ask you something. 
I'm holding a pipe here in my hand. Do you see 
this pipe? 

Pt.: No. 

Dr. JL.: Dr. L, do you see this pipe? 

Dr. LDL.: Yes, I see the pipe. 

Dr. JL.: Peter, how is it when I show you the 
pipe, you say, no you don’t see it, and Dr. L says, 
yes he does see it. How is it he says yes and you 
say no? 

Pt.: Well... The doctor says he don’t see 
it? 

Dr. LDL.: I do see it. 

Pt.: You do see it? 

Dr. JL.: What do you say, Peter? 

Pt.: I do see the pipe. 

Dr. JL.: You do see the pipe? 

Pt.: Unnnhhh. 

Dr. JL.: Now, wait, you tell me. I’ve got the 
pipe here in my hand. Do you see the pipe? 

Pt.: Nope, I don’t see it. 

Dr. JL.: Dr. L, do you see this pipe? 

Dr. LDL.: Yes, I do. 

Dr. JL.: Peter, how is it that Dr. L says yes 
and you say no, and I ask you both the same 
question. How is that? Can you explain that? 

Pt.: Well . . . What was that? 

Dr. JL.: How is it when I show you and Dr. 
L the pipe and I say do you see the pipe, you 
say, no, I don’t see it, and Dr. L says yes I do 
see it? You're both looking at it and you both 
give different answers. How is that? 

Pt.: Probably in his eyes he sees it some way 
different. 

Dr. LDL.: What’s in his hand, Peter? 

Pt.: A pipe. 

Dr. LDL.: What color is it? 

Pt.: It’s brown. 

Dr. LDL.: Do you think it’s brown? 

Pt.: No. 

In an effort to map out the extent of the re- 
versals, several series of questions were devised 
in which the patient was required to write his 
choice of a pair of opposites. In some of the 
series when the patient had responded with a 
written answer, he was queried orally on the same 
items and his answers were noted. Below are 
some of the items and the responses. The choice 
pairs are indicated above the items. 

                                    Written answer   Oral answer 
Yes-No 
  I am a man.                       No               Nope, I’m no man 
  Snow is black.                    Yes              Right 
Right-Wrong 
  A midget is taller than a giant.  Wrong            Should have put right 
  One and one make two.             Wrong            Wrong 
Like-Hate 
  Cats ___ milk.                    Like             Like 
  My mother ___ me.                 Likes            Hates 
Always-Never 
  Sunday comes before Monday.       Always 
  We get wet when we go swimming.   Never 


The opposite speech syndrome which has been 
described has occurred in the patient’s conversa- 
tions with nurses, aides, other patients, and his 
parents. The self-isolating effect of the syndrome 
has been seen most vividly in group activities in 
which other patients, after receiving opposite an- 
swers, have given up attempts to communicate 
with Peter. With respect to the influence of the 
therapist-researcher on the perpetuation of the 
syndrome, there is evidence, as in some of the 
examples presented, that the patient is not con- 
scious of the reversal and that it is used in ordi- 
nary life contexts where the therapist is not pres- 
ent. We may suspect that some heightened aware- 
ness that he is speaking differently from others 
has been produced in the patient by the repeated 
queries of the doctors, but the denial of this dif- 
ference by the patient emphasizes the basically 
automatic feature of the syndrome. 


DISCUSSION 


Undoubtedly the yes-no discrimination is one 
of the best learned in language behavior, and re- 
versal of this discrimination is an event of mo- 
mentous consequence for the individual.² The 
fact that a fairly circumscribed set of habitual 
discriminations is involved makes one feel that 
this reversal is the type of event that learning 
theories should help explain. Unfortunately, the 
learning theory analysis of language has not yet 
systematically integrated or even dealt with patho- 
logical language (2, 9, 10). Osgood (10) has 
given a careful discussion of the relation of moti- 
vational states to language production, but makes 
little mention of pathological language. Typical 
predictions from a learning theory approach to 
language may be exemplified by the following 
quotations from Osgood (10, p. 165): “. . . the 
effects of increased generalized drive upon habit- 
family hierarchies will be to further augment the 
probability of the dominant response and rela- 
tively damp the probabilities of weaker reaction 
tendencies. This has important implications for 
language behavior under stress conditions.” An- 
other way in which drive may operate is by cue 
effects of the drive: “If the motive state of anger 
has been associated with swearing responses, sub- 
sequent occurrences of this motive state will in- 
crease the probability of such responses.” 


2 The changed relationship to the world of reality 
which is effected by this simple reversal may be 
demonstrated for himself by the reader by trying sub 
voce reversed interpretations of what others say and 
of what one wishes to say. It is not recommended 
that this test be made aloud! 


If we regard the psychotic patient as a person 
under stress, the first type of prediction would 
appear to call for a strengthening of the appro- 
priate yes and no responses. The second type of 
prediction, based on the cue effects of drives, ap- 
pears to be capable of encompassing the type of 
speech syndrome seen in this patient, but only 
with considerable elaboration. 

Mowrer in his discussion of language has noted 
(9, p. 689): “how quickly we approach the un- 
known in psycholinguistics as soon as we depart 
from a circumscribed problem or conceptual 
model.” Our insecurity vis-a-vis pathological lan- 
guage may be in part due to the fact that the so- 
called pragmatic (8, p. 38) and interpersonal 
functions of language behavior (5) have been 
largely neglected by learning theorists in favor 
of the meaning or semantic aspects. The case 
described in the present paper, in which a funda- 
mental language habit is distorted, demonstrates 
the importance of language as a functional be- 
havior in dealing with the world and serving the 
needs of the patient. We must now attempt to 
understand what uses the opposite speech may 
serve in our patient. 

The schizophrenic negativism which Kraepelin 
(7) and Bleuler (1) have described at length, 
appears to be related to the symptom of this 
schizophrenic patient. His speech syndrome, how- 
ever, differs from typical schizophrenic negativism 
in two important respects. First, the patient is 
unaware of his reversals, and second, the reversals 
are confined to language and do not extend to 
thought or action. Thus, if asked if he wishes 
a cigaret, the patient may say, “No,” instead of 
“Yes, I do,” but he accepts and smokes the ciga- 
ret. The patient’s speech produces confusion and 
dismay in the listener as does the schizophrenic’s 
negativism, but it does not have the same nega- 
tivistic connotations for the patient since he is 
unaware of his reversals. His syndrome avoids 
the open hostility which is so frequently an obvi- 
ous aspect of schizophrenic negativism, but it has 
a similar outcome of reducing communication 
with and effectively rejecting others. 

The Freudian interpretation of negation (3, 
p. 182) as “. . . a way of taking account of what 
is repressed, indeed, it is actually a removal of 
the repression, though not, of course, an accept- 
ance of what is repressed,” may be applicable to 
this patient’s speech syndrome, since the speech 
reversal permits the verbalization of ideas which 
the patient consciously rejects. Freud’s view (3, 
p. 185) would also lead us to examine how the 
patient deals with hostility, since the withdrawal 
of libidinal cathexes of objects in schizophrenia 
presumably leaves the schizophrenic’s aggressive 
impulses without restraint. 

One of the features of this patient’s personal- 
ity, remarked by all who have had contact with 
him, is his extreme passivity. On only one occa- 
sion during his more than six months of current 
hospitalization did the patient become overtly ag- 
gressive. This was when his mother, who had 
been visiting regularly, appeared one day when 
he was desperately requesting release from the 
hospital. Seeing her, he began to pound on the 
door with his fists and to accuse her of keeping 
him locked up. He calmed down within the space 
of a few minutes and had a very quiet visit with 
his mother a short time later. He has been as- 
saulted several times by other patients, but in each 
case he accepted the attack without attempting to 
fight back. Further clues pointing toward the 
importance of hostility in this patient’s pathology 
and in his peculiar speech syndrome are the facts 
that his original breakdown occurred during com- 
bat and that his delusional and hallucinatory pro- 
ductions have been primarily sexual and religious, 
with relatively little forthright aggressive content. 
We do not wish to convey the impression that 
aggressive tendencies have completely disappeared 
in this patient, but we do stress the notable lack 
of such behavior in situations which seem to 
invite it. 

The process by which unacceptable aggres- 
sive impulses—if the suggested relationship is 
correct—are dealt with by the speech mechanism 
is not clear. The reduction of communication 
with others is the most tangible thing to which 
one can point. We would predict, on the basis 
of our reasoning, that if this man’s speech syn- 
drome were to clear we would either see the out- 
break of aggression or find aggressive content in 
his fantasies. 


SUMMARY 


An unusual speech syndrome of a schizo- 
phrenic patient, in which “yes” and “no” and 
other opposites are reversed without awareness 
on the part of the patient, has been described. 
In discussing the reversal of a strongly established 
linguistic discrimination, it was pointed out that 
learning theorists have largely neglected the study 
of pathological language. Psychiatric and psy- 
choanalytic studies of schizophrenia and patho- 
logical language provide some rationale for under- 
standing the language distortion described. Evi- 
dence was offered to support the view that the 
opposite speech of the patient served primarily as 
a means of coping with hostile impulses. 

HYPOTHESES OF OPPOSITE SPEECH *


JULIUS LAFFAL AND LANE AMEEN 1


* Reprinted by permission from The Journal of
Abnormal and Social Psychology, March, 1959, Vol.
58, No. 2, 267-269.

1 This study is part of a continuing research on
language distortions in schizophrenia, which is supported
by USPHS Grant M-2020.


Subsequent to a paper by the present authors
on opposite speech in a schizophrenic patient (2),
two commentaries were published. Staats (3)
suggested possible etiological factors in opposite 
speech and some ways of dealing with it based on 
principles of reinforcement. Kaplan (1) attempted
to interpret the phenomenon in terms of
Werner’s developmental views. In this note, we 
wish to make clear our differences with these two 
authors. An excerpt from an early recorded con- 
tact with the patient will illustrate what we have 
called opposite speech. 

Dr.: . . . who invented the airplane? 

Pt.: I do know. 

Dr.: You mean, you don’t know. 

Pt.: I do know. 

Dr.: You do know. 

Pt.: Yes, I do. 

Dr.: If you do know, can you tell me? 

Pt.: If I do know, how can I tell you? I could. 

Dr.: You could tell me. 

Pt.: Yes, because I do know. I do know, I do 
know, ah, who invented the airplane. 

Dr.: Okay, if you do know who invented the 
airplane, tell me who invented the airplane. 

Pt.: I can. 

Dr.: You can. 

Pt.: I sure could. 

Dr.: You sure could. Okay, can you tell me 
now who invented the airplane? 

Pt.: I do know. 

Dr.: You do know. 

Pt.: Yes, I know. 

Dr.: That means that you have the answer. 
You have the answer to that question. 

Pt.: Yes. 

Dr.: Yes. All right, now can you tell me what 
the answer is? 

Pt.: Who invented the airplane, I do know. 

Dr.: What you mean to say is that you don’t 
know. 

Pt.: I do know. If I don’t know, I
wouldn’t be able to tell you.

Dr.: You're not able to tell me, though, are 
you? 

Pt.: Yes, I am, for I do know. 

The developments in this case, which cover a 
period of nearly three years since the first report, 
are briefly as follows. The patient recovered from 
his psychosis while in the hospital and has been 
working and living productively outside of the 
hospital for the past 20 months. Immediately 
preceding his recovery, during a trial visit at 
home, he became extremely hostile and threaten- 
ing toward his parents; even fearing for their 
lives, they returned the patient to the hospital with
the help of the police. It was at this time that 
the opposite speech was first noted to be absent, 
and it was shortly after his return to the hospital 
that the patient’s psychosis cleared up. The cir- 
cumstances surrounding the disappearance of the 
opposite speech gave striking support to the first 
alternative of our prediction (2, p. 413) “that if 
this man’s speech syndrome were to clear we 
would either see the outbreak of aggression or 
find aggressive content in his fantasies.” In the 
long retrospect, it is now apparent that the sec- 
ond alternative, relating to aggressive fantasy, may 
also be correct. Periodically in the therapy which 
has continued on a once-a-month basis since the 
patient left the hospital, he has brought up, with 
great hesitation and embarrassment, an anxiety- 
ridden, obsessive fantasy of being unexpectedly 
seized and subjected to shock treatment. 

The points which Staats (3) makes about the 
patient’s syndrome are (a) the opposite speech 
elicits attention from others, and this serves as 
a reinforcement of the speech pattern; (b) lack 
of reinforcement, or withholding compliance with 
the patient’s desires when he uses opposite speech, 
should weaken the opposite speech; and (c) anx- 
iety accounts for the opposite speech, correct 
speech being anxiety-arousing for the patient be- 
cause of its typical content. Staats (3, p. 269) 
suggests as the source of the anxiety, “that the 
unhappy life situation of an adult schizophrenic 
probably elicits thought and speech which are not 
positive secondary reinforcers, but instead arouse 
anxiety.” 

The first of these points was considered in the 
original paper (2, p. 410) in terms of the possi- 
bility that the opposite speech might be persisting 
because of certain secondary rewards, i.e., because 
of the interest it aroused. However, it was noted 
that relatively little time was devoted to the sheer 
exploration of the patient’s speech syndrome it- 
self, the emphasis in all contacts being on treat- 
ment. A strong argument that interest did not 
maintain the symptom is that the symptom disap- 
peared while we continued to be interested in it. 

Regarding Staats’s second point: Despite dem- 
onstrations and remonstrances by doctors and 
others that he was reversing his speech, the patient
continued to do so. Early in his hospitalization,
the patient repeatedly demanded in opposite
speech that he be discharged. The staff was far 
from complying with this request, but there was 
no evidence that the opposite speech changed on
that account. It is true that no systematic effort 
was made to withhold reinforcement or to chide 
the patient whenever he used opposite speech. 
However, there is a paradox in Staats’s argument 
in that to have used such a systematic approach 
would, of necessity, have involved giving the op- 
posite speech the very attention which Staats be- 
lieves supported it. 

With respect to Staats’s final point, anxiety was 
certainly a factor in the opposite speech. How- 
ever, since the patient returned to his ordinary 
life situation and lost the symptom, it is doubtful 
that the syndrome is attributable to his “unhappy 
life situation.” 

Kaplan’s (1) treatment of the opposite speech
syndrome is an attempt to conceive this phenomenon
in terms of the developmental views of Werner.
We differ with him on his central thesis
(1, p. 390) that “opposite speech is presumed to 
arise because the content to which the linguistic 
signs refer is in such a global, undifferentiated 
state that the linguistic vehicles (seemingly dis- 
crete) really share the same global referent.” In 
our original paper we pointed out that opposite 
speech consists (2, p. 409) “basically in the use 
of ‘yes’ by the patient when he means ‘no,’ and 
vice versa . . . it includes the interchange of
‘right’ and ‘wrong,’ the interchange of ‘do’ and
‘don’t,’ and occasionally the interchange of such 
opposites as ‘something’ and ‘nothing’ and ‘like’ 
and ‘hate.’” We stressed the disparity between 
the conscious intention of the patient and the ver- 
balizations he used to communicate the intention. 
Kaplan (1, p. 390), in talking about “a lack of 
differentiation in the use of affirmative and nega- 
tive forms of judgment, e.g., ‘I know,’ ‘I don’t 
know,’ etc.,” is not talking about opposite speech 
as we have described it, but about a phenomenon 
more in keeping with his developmental notions, 
namely, an early state of language development 
in which some area of reference is not sufficiently 
discriminated by the subject to permit distinctive 
verbal labeling either of parts or even of extremes 
of the area.

In the opposite speech of the patient under consideration,
it is not that the referent or the speech
is undifferentiated, but that the language opera- 
tions are appropriate to the opposite of the refer- 
ent. The patient intends something quite succinct 
and differentiated, but uses opposite speech. Thus, 
if one went through the interview excerpted above 
and interchanged yes and no and do and don’t,
one would have a nearly normal sounding conver- 
sation. Such a high degree of consistency of re- 
versal would indicate that the referents remain dis- 
tinct and differentiated, and also that opposite 
language operations are well discriminated by the
patient. The pathology lies in the fact that the 
language used by the patient is appropriate to the 
opposite of his referents or of what he intends. 

The opposite speech phenomenon is open to 
many interpretations and may be looked at from 
many points of view. The interpretation originally 
offered (2) that it was a function of fear of ag- 
gression has been consistent both with the cir- 
cumstances surrounding the disappearance of the 
opposite speech, and with the subsequent content 
of the patient’s production in therapy. 

A CASE OF MULTIPLE PERSONALITY *


CORBETT H. THIGPEN
AND HERVEY CLECKLEY 


* Reprinted by permission from The Journal of
Abnormal and Social Psychology, January, 1954, Vol.
49, No. 1, 135-151.


The psychiatric manifestation called multiple
personality has been extensively discussed. So
too have the unicorn and the centaur. Who has
not read of these legendary quadrupeds? Their
pictures are, perhaps tiresomely, familiar to any
schoolboy. Can one doubt that during medieval
times many twilight encounters with the unicorn
were convincingly reported? Surely in the days
of Homer there were men of Thessaly or Boeotia
who had seen, or even ridden, centaurs almost as 
wise as Chiron. 

The layman who at college took a course in 
psychology may feel that for him dual personality, 
or multiple personality, is a familiar subject. Some 
psychiatrists’ reactions suggest they are inclined 
to dismiss this subject as old hat. Nevertheless, 
like the unicorn and the centaur in some respects, 
multiple personality, despite vivid appearances in 
popularized books on psychology (2), is not com- 
monly encountered in the full reality of life Gh 
16, 17). Nearly all those perplexing reports of 
two or more people in one body, so to speak, that 
arouse a unique interest in the classroom, are 
reports of observations made in a relatively distant 
past. The most significant manifestations of this 
sort discussed in the current literature occurred in 
patients studied half a century or more ago (13, 
23). It is scarcely surprising that practical psy- 
chiatrists today, never having directly observed 
such things as Morton Prince found in Miss Beau- 
champ or as Azam reported of Felida, might hold 
a tacitly skeptical attitude toward such archaic 
marvels and miracles. In the fields of internal 
medicine and chemistry the last, or even the mid- 
dle, decades of the nineteenth century are close 
to us. In the relatively new field of psychopath- 
ology they are almost primeval, a dim dawn era 
in which we find it easy to suspect that a glimpse 
of a rhinoceros might have led to descriptions of 
the unicorn, or the sound of thunder been misin- 
terpreted as God’s literal voice. 

A reserved judgment toward what cannot be 
regularly demonstrated is not necessarily deplor- 
able. Some current tendencies suggest that our 
youthful branch of medicine may not yet have 
emerged from its primordial and prerational 
phase. The discovery of orgone by one of our 
erstwhile leaders in the development of “psycho- 
dynamics” should not be ignored (4, 25). En- 
thusiastically adduced “proof” from an adult’s 
dream that he was as an embryo significantly 
traumatized by fear of his father’s penis, which 
during intercourse threatened him from his moth- 
er’s vagina, is, we believe, the sort of evidence 
toward which our “resistance” is not without value 
(21). Despite Morton Prince’s exquisitely thor- 
ough study of the celebrated Miss Beauchamp 
(23, 24) it is not surprising that decades ago 
McDougall should have warned us: 

It has been suggested by many critics that, in
the course of Prince’s long and intimate deal- 
ings with the case, involving as it did the fre- 
quent use of hypnosis, both for exploratory and 
therapeutic purposes, he may have moulded the 
course of its development to a degree that can- 
not be determined. This possibility cannot be 
denied (16, p. 497). 

It is perhaps significant to note that, despite 
the light (or at least the half-light) they throw on 
most of the puzzling manifestations of psychiatric 
disorder, the studies of Prince and others on mul- 
tiple personality are not even mentioned in some 
of the best and most popular textbooks of psy- 
chiatry used in our medical schools today (19, 
26). When mentioned at all in such works, the 
subject is usually dismissed with a few words 
(11, 20). It is particularly noteworthy that 
Freud, during his years of assiduous investigation, 
apparently displayed no appreciable interest in the 
development of this disorder. Erickson and 
Kubie cite one brief allusion (9) which they term 
his “only reference to the problem” (6). 

Psychiatrists who would not deny outright the 
truly remarkable things reported long ago about 
multiple personality, even when accepting them 
passively in good faith seem often to do so per- 
functorily. In the midst of clinical work, with its 
interesting immediate experiences and pressing 
demands, few are likely to focus a major interest 
on what is known to them only through dust-cov- 
ered records, on what they have never encoun- 
tered, and don’t expect to deal with. During the 
complications and excitements of a stormy sea
voyage even the most sincere believer in the 
miracle of Jonah will probably not look to whales 
for his chief solution of problems that may arise 
from shipwreck. 

Our direct experience with a patient has forced 
us to review the subject of multiple personality. 
It has also provoked in us the reaction of won- 
der, sometimes of awe. 

One of us (C. H. T.) had for several months 
been treating a twenty-five-year-old married 
woman who was referred because of “severe and 
blinding headaches.” At the first interview she 
also mentioned “blackouts” following headache. 
These were vaguely described by the patient. 
Her family was not aware of anything that would 
suggest a real loss of consciousness or serious 
mental confusion. During a series of interviews 
which were irregular, since the patient had to 
come from some distance away, several important 
emotional difficulties were revealed and discussed.
Encouraging symptomatic improvement occurred, 
but it was plain that this girl’s major problems had 
not been settled. To the therapist, Eve White— 
as we shall call her—was an ordinary case with 
commonplace symptoms and a relatively complex 
but familiar constellation of marital conflicts and 
personal frustrations. We were puzzled during 
therapy about a recent trip for which she had no 
memory. Hypnosis was induced and the amnesia 
cleared up promptly. Several days after a visit 
to the office a letter was received. (Exhibit 1.) 

What was the meaning of such a letter? Though 
unsigned, the postmark, the content, and the fa- 
miliar penmanship in most of the message revealed 
to the therapist that this had been written by 
Eve White. The effect of this letter on the thera- 
pist was considerable. It raised puzzling ques- 
tions for which there were no answers and set in 
motion thoughts that pursued various and vague 
directions. Had some child found the uncompleted
page, scribbled those words, and, perhaps
as a whim, mailed it in an already addressed en- 
velope? Perhaps. The handwriting of the last 
paragraph to be sure suggested the work of a 
child. Could Eve White herself, as a puerile 
prank, have decided to disguise her characteristic 
writing and added this inconsequential note? 
And if so, why? Mrs. White had appeared to be 
a circumspect, matter of fact person, meticu- 
lously truthful and consistently sober and serious 
about her grave troubles. It was rather difficult 
to imagine her becoming playful or being moved 
by an impulse to tease, even on a more appropriate
occasion. The “blackouts” which she had
rather casually mentioned, but which did not seem 
to disturb her very much, suggested of course that 
a somnambulism or brief fugue might have oc- 
curred. 

On her next visit she denied sending the letter, 
though she recalled having begun one which she 
never finished. She believed she had destroyed it. 
During this interview Eve White, ordinarily an 
excessively self-controlled woman, began to show 
signs of distress and agitation. Apprehensively 
and reluctantly she at last formulated a question: 
Did the occasional impression of hearing an im- 
aginary voice indicate that she was “insane”? 

To the therapist this information was startling. 
Nothing about Eve White suggested even an early 
schizoid change. Her own attitude toward what 
she now reported was in no respect like any of 
the various attitudes of patients who are in the
ordinary sense experiencing auditory hallucina- 
tions. Yet, she insisted with painful embarrass- 
ment, she had on several occasions over the last 
few months heard briefly but distinctly a voice 
addressing her. Something about her reaction to 
this may be conveyed if we compare it to what 
we can imagine an experienced psychiatrist in 
robust mental health might feel if, with full re- 
tention of insight, he heard himself similarly ad- 
dressed. While the therapist, hesitating a moment 
in wonder, sought for an adequate reply, an 
abstruse and inexplicable expression came, appar- 
ently unprompted by volition, over Eve White’s 
familiar countenance. As if seized by a sudden 
pain she put both hands to her head. After a 
tense moment of silence, her hands dropped. 
There was a quick, reckless smile and, in a bright 
voice that sparkled, she said, “Hi there, Doc!” 

The demure and constrained posture of Eve 
White had melted into buoyant repose. With a 
soft and surprisingly intimate syllable of laughter, 
she crossed her legs. Disconcerted as he was by 
unassimilated surprise, the therapist noted from 
the corner of his awareness something distinctly 
attractive about them, and also that this was the 
first time he had received such an impression. 
There is little point in attempting here to give in 
detail the differences between this novel feminine 
apparition and the vanished Eve White. Instead 
of that retiring and gently conventional figure, 
there was in the newcomer a childishly daredevil 
air, an erotically mischievous glance, a face mar- 
vellously free from the habitual signs of care, 
seriousness, and underlying distress, so long fa- 
miliar in her predecessor. This new and appar- 
ently carefree girl spoke casually of Eve White 
and her problems, always using she or her in 
every reference, always respecting the strict 
bounds of a separate identity. When asked her 
own name she immediately replied, “Oh, I’m Eve 
Black.” 

It is easy to say that this new voice was dif- 
ferent, that the basic idiom of her language was 
plainly not that of Eve White. A thousand
minute alterations of manner, gesture, expression,
posture, of nuances in reflex or instinctive reaction,
of glance, of eyebrow tilting and eye movement,
all argued that this could only be another
woman. It is not possible to say just what all
these differences were.


EXHIBIT 1

This letter in retrospect was the first intimation
that our patient was unusual. The dramatic and
unexpected revelation of the second personality
shortly followed.

It would not be difficult for a man to distin- 
guish his wife, or perhaps even his secretary, 
if she were placed among a hundred other women 
carefully chosen because of their resemblance to 
her, and all dressed identically. But few would 
wager that, however articulate he might be, he 
could tell a stranger, or even someone very slightly 
acquainted with her, how to accomplish this task. 
If he tries to tell us how he himself recognizes 
her, he may accurately convey something to us. 
But what he can convey, no matter how hard he 
tries, is only an inconsequential fragment. It is 
not enough to help us when we set out to find her. 
So, too, we are not able to tell adequately what 
so profoundly distinguishes from Eve White the 
carefree girl who took her place in this vivid 
mutation. 

Even before anything substantial of her history 
could be obtained, the therapist reacted to the new 
presence with feelings that momentarily recalled 
from distant memory these words: 


The devil has entered the prompter’s box 
And the play is ready to start. 


Over a period of 14 months during a series of 
interviews totaling approximately 100 hours, ex- 
tensive material was obtained about the behavior 
and inner life of Eve White—and of Eve Black. 
It is our plan to report on this more adequately 
in a book-length study. Here space limits our 
presentation to a few details. 

Eve Black, so far as we can tell, has enjoyed an
independent life since Mrs. White’s early childhood.
She is not a product of disruptive emotional
stresses which the patient has suffered during
recent years. Eve White apparently had no
knowledge or suspicion of the other’s existence
until some time after she appeared unbidden before
the surprised therapist. Though Mrs. White
has learned that there is a Miss Black during the
course of therapy, she does not have access to the
latter’s awareness. When Eve Black is “out,”
Eve White remains functionally in abeyance, quite
oblivious of what the coinhabitant of her body
does, and apparently unconscious.


1 The question: “How can the various personalities
be called out?” has been asked. After the original
spontaneous appearance of Eve Black it was at first
necessary for Eve White to be hypnotized in order
for us to talk with Eve Black. How Eve Black could
“pop out” of her own accord at unpredictable times
and yet could not come out on request, we do not
know. Under hypnosis of Eve White, Eve Black
could very easily be called forth. After a few hypnotic
sessions, we merely had to request Eve White
to let us speak to Eve Black. Then we called Eve
Black’s name, and Eve Black would come forth. The
reverse was true when Eve Black was out and we
wished to speak with Eve White. Hypnosis was no
longer necessary for the purpose of obtaining the
changes. This made things simpler for us but complicated
Eve White’s life considerably because Eve
Black found herself able to “take over” more easily
than before. A third personality, Jane, to be described
below, emerged spontaneously and we have
never had to employ hypnosis to reach her.


On the contrary, Eve Black preserves awareness
while absent. Invisibly alert at some unmapped
post of observation, she is able to follow the actions
and the thoughts of her spiritually antithetical
twin. The hoydenish and devil-may-care
Eve Black “knows” and can report what the other
does and thinks, and describes her feelings. Those
feelings, however, are not Eve Black’s own. She
does not participate in them. Eve White’s genuine
and natural distress about her failing marriage
is regarded by the other as silly. Eve White’s
love and deep concern for her only child, a little
girl of four, is to us and to all who know her,
warm, real, consistent, and impressive. Eve
Black, who shares her memory and verbally knows
her thoughts, discerns her emotional reactions and
values only as an outsider. They are for the outsider
something trite, bothersome, and insignificant.
The devotion of this mother for her child,
as an empty definition, is entirely familiar to the
lively and unworried Eve Black. Its substance
and nature are, however, so clearly outside her
personal experience that she can evaluate it only
as “something pretty corny.”

During the temporary separation of her parents,
which may become permanent, this little
girl is living with her grandparents in a village.
Because her earnings are necessary for her child’s
basic welfare, the mother has no choice but to
work and live in a city approximately a hundred
miles from the child. Having apparently known
little but unhappiness with her husband, she was
finally forced to the conclusion that her young
and vulnerable child had little chance of happy
or normal development in the home situation,
which, despite her best efforts, continually grew
worse. She now endures the loneliness, frustra- 
tion, and grief of separation from her warmly 
loved daughter, who is the primary object of her 
life and feeling, and who, she has good reason to 
fear, is likely to grow up apart from her. Per- 
haps, it seems to her sometimes, she will become 
to her as years pass little more than a coolly ac- 
cepted stranger. 

Vulnerable, uningenious, and delicately fem- 
inine, Eve White characteristically preserves a 
quiet dignity about personal sorrow, a dignity 
unpretentiously stoic. Under hypnosis one can 
come closer to the sadness and the lonely despair 
she feels it her task not to display. Even then no 
frantic weeping occurs, no outcries of self-pity. 
Her quiet voice remains level as she discusses 
matters that leave her cheeks at last wet from 
silent tears. 

Despite access to this woman’s “thoughts” Eve 
Black has little or no real compassion for her. 
Nor does she seem in any important sense actively, 
or purposefully, cruel. Neutral or immune to 
major affective events in human relations, an un- 
participating onlooker, she is apparently almost 
as free of hatefulness, or of mercy, or of com- 
prehension, as a bright-feathered parakeet who 
chirps undisturbed while watching a child strangle 
to death. 

It has been mentioned that Eve Black’s career 
has been traced back to early childhood. She 
herself freely tells us of episodes when she 
emerged, usually to engage in acts of mischief 
or disobedience. She lies glibly and without com- 
punction, so her account alone can never be taken 
as reliable evidence. Since Eve White, whose 
word on any matter has always proved good, still 
has no access to the other’s current awareness or
her memory and, indeed, did not until recently 
even faintly suspect her existence, it has been im- 
possible through her to check fully and immedi- 
ately on Eve Black’s stories. Her memory has, 
however, afforded considerable indirect evidence 
since she has been able to confirm reports of 
punishments she received, of accusations made 
against her, for deeds unknown to her but de- 
scribed to us by Eve Black. 

Some stories have been substantiated through 
others. Both of this patient’s parents, as well as 
her husband, have been available for interviews. 
They recall several incidents that Eve Black had 
previously reported to us. For instance, the parents
had had to punish their ordinarily good and
conforming six-year-old girl for having disobeyed 
their specific rule against wandering through the 
woods to play with the children of a tenant farmer. 
They considered this expedition dangerous for so 
young a child, and their daughter’s unaccountable 
absence had caused them worry and distress. On 
her return, Eve received a hearty whipping despite 
her desperate denials of wrongdoing or disobedi- 
ence. In fact these very denials added to her 
punishment, since the evidence of her little trip 
was well established and her denial taken as a 
deliberate lie. Eve Black had previously described 
this episode to us in some detail, expressing 
amusement about “coming out” to commit and 
enjoy the forbidden adventure and withdrawing 
to leave the other Eve, sincerely protesting her 
innocence, to appreciate all sensations of the 
whipping. 

The adult Eve White recalled this and several 
other punishments which she had no way of un- 
derstanding and which sometimes bewildered her 
in her relations with her parents. 

Irresponsibility and a shallowly hedonistic 
grasping for ephemeral excitements or pleasures 
characterize Eve Black’s adult behavior. She 
succeeded in concealing her identity not only 
from the other Eve but also from her parents and 
the husband. She herself denies marriage to this 
man, whom she despises, and any relation to 
Eve White’s little girl except that of an uncon- 
cerned bystander. Though she had often “come 
out” in the presence of all these people, she went 
unrecognized until she agreed to reveal herself to 
them in the therapist’s office. 

Her wayward behavior, ill will, harshness, and 
occasional acts of violence, observed by Mr. 
White and the parents, were attributed to unac- 
countable fits of temper in a woman habitually 
gentle and considerate. 

During her longer periods “out,” when she 
expresses herself more freely in behavior so un- 
like that of Eve White, she avoids her family and 
close friends, and seeks the company of strangers 
or of those insufficiently acquainted with her al- 
ternate to evaluate accurately the stupendous 
transformation. 

Once we had seen and spoken with Eve Black, 
it seemed to us at first scarcely possible that, even 
in the same body as her alternate, she could for 
so long have concealed her separate identity from 
others. Yet, who among those acquainted with 
her would be likely to suspect, however unlike
herself Eve appeared at times to be, such a situa- 
tion as that voluntarily revealed to us by the 
patient? No matter how many clues one is given, 
no matter how obvious the clues, one will not be 
led to a conclusion that is inconceivable. One 
will seek explanations for the problem only among 
available hypotheses. 

Not knowing the only concept into which suc- 
cessive details of perception will fit, even a very 
astute man may observe a thousand separate fea- 
tures of something his imagination has never 
shaped without grasping the gestalt, without be- 
ing able to put into a recognizable whole the 
details he has so clearly detected. Only our pre- 
vious familiarity with three-dimensional space en- 
ables us to see the representation of depth in a 
picture. What is for us still unconceived can give 
us a thousand hints, boldly flaunt before us its 
grossest features, and remain for us undelineated, 
formless, uncomprehended as an entity. 

The astonishingly incompatible gestures, expres- 
sions, attitudes, mannerisms, and behavior which 
Eve occasionally displayed before intimates pro- 
voked thought and wonder, demanded explana- 
tion. But who in the position of these people 
would be likely to find or create in his mind the 
hypothesis that forms a recognizable image? Let 
us remember too that Eve Black, until she volun- 
tarily named herself to the therapist, meant to re- 
main unrecognized. When it suits her, she de- 
liberately and skillfully acts so as to pass herself 
off as Eve White, imitating her habitual tone of 
voice, her gestures, and attitudes. Let us not
forget that she is shrewd. Would it not, after all,
require a sledge-hammer blow from the obvious 
to drive into an unsuspecting acquaintance the 
only hypothesis that would lead to her recognition? 2

Psychometric and projective tests conducted on 
the two Eves by a well-qualified expert were re- 
ported thus: 


2 Eve White’s husband and parents were troubled 
by the unexplicable changes in her. They assumed 
them to be “fits of temper” about which she lied. 
Her mother called the fugues of her daughter these 
“strange little habits.” Apparently these people ob- 
served the same changes that we have observed, but 
unlike ourselves, they have not had the conception of 
multiple personality in mind. Lacking it, they could
not use it as an explanatory construct.


PSYCHOLOGICAL CONSULTATION REPORT


This twenty-five-year-old married female patient 
was referred for psychological examination with a 
provisional diagnosis of dual personality. Two 
complete psychological examinations were re- 
quested, one of the predominant personality, Mrs. 
White, the other, . . . of the secondary person- 
ality, Miss Black. 

The patient is the oldest of three siblings, hav- 
ing twin sisters. She quit school two months be- 
fore graduation from high school. She was em- 
ployed as a telephone operator. She has been 
married six years and has a girl four years old. 
Patient states that she did things recently she can- 
not remember having done, and expresses serious 
concern about this condition. The following psy- 
chological tests were administered in both exami- 
nations: 

Wechsler-Bellevue Intelligence Scale 

Wechsler Memory Scale 

Drawings of Human Figures 

Rorschach 

Test Behavior—Patient was neat, friendly, and 
cooperative. However, while Mrs. White was 
more serious, more conscientious, and displayed 
more anxiety, Miss Black appeared somewhat less 
anxious and was satisfied with giving more super- 
ficial responses. Still the basic behavior pattern 
was very similar in both personalities, indicating 
that inhibitory forces were not markedly abolished 
even in the role of the desired personality. Speech 
was coherent, and there were no distortions in 
ideation or behavior according to the assumed per- 
sonality. No psychotic deviations could be ob- 
served at the present time. 

Tests Results—While Mrs. White is able to 
achieve an IQ of 110 on the Wechsler-Bellevue 
Intelligence Scale, Miss Black attains an IQ of 104 
only. There is evidence that the native intellectual
endowment is well within the bright normal
group; however, in Mrs. White’s case anxiety and 
tenseness interfere, in Miss Black’s superficiality 
and slight indifference as to achievement are re- 
sponsible for the lower score. While Mrs. White 
shows more obsessional traits, Miss Black shows 
more hysterical tendencies in the records. It is 
interesting to note that the memory function in 
Miss Black is on the same level as her Intelli- 
gence Quotient, while Mrs. White’s memory func- 
tion is far above her IQ, although she complained 
of a disturbance of memory. The only difficulty 
encountered by both personalities is on recall of 
digits, a performance in which telephone operators
usually excel! On the other hand, the Rorschach 
record of Miss Black is by far healthier than the 
one of Mrs. White. In Miss Black’s record a 
hysterical tendency is predominant, while Mrs. 
White’s record shows constriction, anxiety, and 
obsessive compulsive traits. Thus Miss Black is 
able to conform with the environment, while Mrs.
White is rigid and not capable of dealing with her
hostility. 

Personality Dynamics —A comparison of the 
projective tests indicates repression in Mrs. White 
and regression in Miss Black. The dual person- 
ality appears to be the result of a strong desire 
to regress to an early period of life, namely the 
one before marriage. Miss Black is actually the 
maiden name of Mrs. White. Therefore, there are 
not two different personalities with completely dis- 
similar ideation, but rather one personality at two 
stages of her life. As a characteristic for this type 
of case, the predominant personality is amnesic 
for the existence, activities, or behavior of the sec- 
ondary or subordinate system, while the secondary 
personality is aware and critical of the predomi- 
nant personality’s activities and attitudes. The 
latter reaction is quite similar to the ego-conflict 
in obsessive compulsive disturbances. 

Mrs. White admits difficulty in her relation with 
her mother, and her performance on the Ror- 
schach and drawings indicate conflict and result- 
ing anxiety in her role as a wife and mother. Only 
with strong conscious effort can she compel her- 
self to subject herself to these roles. The en- 
forced subjection results in ever increasing hos- 
tility. This hostility, however, is not acceptable 
to her, and activates a defense mechanism of re- 
gression to avoid severe guilt feelings, by remov- 
ing the entire conflictual situation from conscious 
awareness. At the same time, the new situation 
(in which she plays the role of Miss Black) per- 
mits her to discharge some of her hostility towards 
Mrs. White. Miss Black on the other hand has 
regained her previous status of freedom from 
marital and maternal conflicts, and thus has liber- 
ated herself from the insoluble situation in which 
Mrs. White found herself through her marriage. 
In addition, she can avert the—in her conviction 
—inevitable spiritual loss of her child. Thus, it 
is not surprising that she shows contempt for 
Mrs. White who permitted herself to become in- 
volved in such a situation because of her lack of 
foresight, as well as her lack of courage to forcefully
solve the dilemma.

Actually the problem started at a much earlier 
period of life, with a strong feeling of rejection 
by her parents, especially after the birth of her 
twin sisters. Mrs. White loves them dearly, Miss 
Black despises them. In this connection an epi- 
sode is related by Miss Black. After quitting 
school to help support the family, she (that is to 
say Mrs. White) sent home money to be used 
for overcoats for her twin sisters, denying herself 
a badly wanted wristwatch. When the money was 
used to buy them two wristwatches instead of over- 
coats, she reacted with strong, but repressed, hos- 
tility. Significantly, she removed her wristwatch 
while examined as Mrs. White, stating that she 
doesn’t like jewelry. There are several illustrations
of her strong sense of rejection as well as
sibling rivalry in her records.

Leopold Winter, Ph.D. 

Clinical Psychologist 

U. S. Veterans Administration Hospital

Augusta, Georgia 

July 2, 1952 

With the circumspect Eve White oblivious of 
her escapades, Miss Black once recklessly bought 
several expensive and unneeded new dresses and 
two luxurious coats. Sometimes she revels in 
cheap night clubs, flirting with strange men on 
the make. Insouciantly she pursues her irre- 
sponsible way, usually amused, sometimes a lit- 
tle bored, never alarmed or grieved or seriously 
troubled. She has, apparently, been unmoved by 
any sustaining purpose, unattracted by any steady 
goal, prompted only by the immediate and the 
trivial. 

Eve White’s husband, on discovering the valu- 
able outlay of new clothes, which the other Eve 
had hidden carefully away, lost his temper and 
abused his wife for wantonly plunging him into 
debt. He found no way to accept her innocent 
denials as genuine but was at length assuaged in 
wrath by her wholehearted agreement that it 
would be disastrous for them to run up such a 
bill, and her promptness in returning all these 
garments to the store.³ Eve White has told us of
many real and serious incompatibilities with her 
husband. Even if the two were unmolested by an 
outsider, it is doubtful if the imperfections of 
this marriage, its unhappiness, and the threats to 
its continuation could be alleviated. Adverse acts
and influence by an insider have been peculiarly 
damaging and pernicious. Though Eve Black does 
not apparently follow a consistent purpose to 
disrupt the union, or regularly go out of her way 
to make trouble for the couple, her typical be- 
havior often compounds their difficulties. 

3 Mrs. White apparently failed to produce a satisfactory
rationalization. This is true for all of her
fugue states. She did tell us she suspected that her
husband may have planted the clothes in order to
make it appear that she was “insane.” She did not,
however, seem to come to grips with the problem.
Apparently finding it, along with so many other problems,
too much for her, she took an attitude in some
ways like that of Scarlett O'Hara when the latter
would tell herself, “Well, tomorrow will be another
day.”

“When I go out and get drunk,” Eve Black
with an easy wink once said to both of us, “she
wakes up with the hangover. She wonders what
in the hell’s made her so sick.”

Though as a rule only indifferent, passively
callous to her alternate’s child, Eve Black once in
the past became irritated with her and hurt her.
Apparently she might have done her serious harm
had her husband not restrained her. This act
she denied and lied about consistently though the
evidence for it through others is strong. Later
she flippantly confessed, giving as her reason,
“The little brat got on my nerves.”

Abstract terms and other descriptive words are
not likely to convey much of what one experiences
directly of a human being, of a specific personal
entity. Nor could any list of ten thousand
such items be even near complete. Let us, nevertheless,
set down for what they are worth a few
points:


EVE WHITE


Demure, retiring, in some respects almost saintly

Face suggests a quiet sweetness; the expression of
repose is predominantly one of contained sadness

Clothes: simple and conservative, neat and inconspicuously
attractive

Posture: tendency to a barely discernible stoop or
slump. Movements careful and dignified

Reads poetry and likes to compose verse herself

Voice always softly modulated, always influenced
by a specifically feminine restraint

Almost all who know her express admiration and
affection for her. She does not provoke envy.
Her strength of character is more passive than
active. Steadfast on defense but lacking initiative
and boldness to formulate strategy of attack

An industrious and able worker; also a competent
housekeeper and a skillful cook. Not colorful
or glamorous. Limited in spontaneity

Consistently uncritical of others. Tries not to
blame husband for marital troubles. Nothing
suggests pretense or hypocrisy in this charitable
attitude

Though not stiffly prudish and never self-righteous,
she is seldom lively or playful or inclined
to tease or tell a joke. Seldom animated

Her presence resonates unexpressed devotion to
her child. Every act, every gesture, the demonstrated
sacrifice of personal aims to work hard
for her little girl, is consistent with this love

Cornered by bitter circumstances, threatened with
tragedy, her endeavors to sustain herself, to defend
her child, are impressive

This role in one essentially so meek and fragile
embodies an unspoken pathos. One feels somehow
she is doomed to be overcome in her present
situation

No allergy to nylon has been reported


EVE BLACK 


Obviously a party girl. Shrewd, childishly vain, 
and egocentric 

Face is pixie-like; eyes dance with mischief as if 
Puck peered through the pupils 

Expression rapidly shifts in a light cascade of fun- 
loving willfulness. The eyes are as inconstant 
as the wind. This face has not and will never 
know sadness. Often it reflects a misleading 
and only half-true naivete 

Voice a little coarsened, “discultured,” with echoes 
or implications of mirth and teasing. Speech 
richly vernacular and liberally seasoned with 
spontaneous gusts of rowdy wit 

A devotee of pranks. Her repeated irresponsibili- 
ties have cruel results on others. More heed- 
less and unthinking, however, than deeply ma- 
licious. Enjoys taunting and mocking the Sia- 
mese alternate 

All attitudes and passions whim-like and momen- 
tary. Quick and vivid flares of many light feel- 
ings, all ephemeral 

Immediately likable and attractive. A touch of 
sexiness seasons every word and gesture. 
Ready for any little, irresponsible adventure 

Dress is becoming and a little provocative. Pos- 
ture and gait suggest light-heartedness, play, a 
challenge to some sort of frolic 

Never contemplative; to be serious is for her to be 
tedious or absurd 

Is immediately amusing and likable. Meets the 
little details of experience with a relish that is 
catching. Strangely “secure from the contagion
of the world’s slow stain,” and from inner aspect 
of grief and tragedy 

Reports that her skin often reacts to nylon with 
urticaria. Usually does not wear stockings 
when she is “out” for long periods

It is not possible here even to summarize the
history of each personality that emerged and ac- 
cumulated over the months, or to describe the 
varied and multiplex complications that arose to 
tax, and often to baffle and overwhelm, the thera- 
pist’s efforts. Let us note briefly a few scattered 
items. 

In contrast with the interesting case reported 
by Erickson and Kubie (6), the secondary per- 
sonality, Eve Black, has shown anything but a 
regular desire to help the other with her problems. 
The considerably submerged and dissociated man- 
ifestations referred to by Erickson and Kubie as 
Miss Brown apparently expressed themselves only 
through the medium of automatic writing. And 
this writing was so verbally imperfect and abstruse 
that considerable interpretation or translation was 
necessary to promote even limited communication. 
Nevertheless, whatever the influence designated by 
the term Miss Brown may represent, it consistently 
worked to aid the accessible personality, Miss 
Damon. It was a therapeutic influence (6). 

Efforts to interest Eve Black in taking a similar 
role met with grim obstacles. Many of these, as 
can be imagined, were not unlike what impedes 
and frustrates the psychiatrist who tries to help 
a typical psychopath deal more constructively with 
his own problems, to find real goals and to develop 
normal evaluations. New toys or games can 
sometimes serve to arouse briefly the interest of a 
capricious child. So, too, the therapist occasion- 
ally was able to enlist Eve Black’s support in some 
remedial aim directed towards the problems of 
her body’s coinhabitant. Sometimes attaining in 
her even an attitude of neutrality was of value. 
What helpful acts or abstentions she could be 
induced to contribute have, however, been 
prompted, it seems, only by fleeting impulses such 
as casual curiosity, the playful redirection of a 
whim towards some pretty novelty. Often she 
has, by ingenious lies, misled the therapist to be- 
lieve she was cooperating when her behavior was 
particularly detrimental to Eve White’s progress. 

No real or persistently constructive or sym- 
pathetic motivation has yet been induced in the 
irresponsible Eve, but one valuable means of in- 
fluencing her is in the hands of the therapist. 
Though Eve Black has apparently been able since 
childhood to disappear at will, often doing this 
suddenly to leave the conscientious Eve with un- 
pleasant consequences of misconduct and folly 
not her own, the ability to displace Eve White’s 


consciousness and emerge to take control has 
always been limited. Sometimes she could “get 
out” and sometimes not. Since Eve White during 
treatment learned of the other’s existence it has 
become plain that her willingness to step aside 
and, so to speak, to release the imp plays an 
important part in this alternate’s ability to appear 
and express herself directly. Eve White cannot 
keep the other suppressed permanently or count 
with certainty on doing this for some given period. 
Her influence, and indirectly that of the therapist, 
have, however, been sufficiently strong to use for 
bargaining with Eve Black for better cooperation. 
If she will avoid the more serious forms of mis- 
conduct she is rewarded with more time “out.” 

Even when invisible and inaccessible she, ap- 
parently, has means of disturbing Eve White. She 
tells us she caused those severe headaches that 
brought the latter to us as a patient. Her un- 
successful struggle to get out often produces this 
symptom in the other. So too, she explains that 
the hallucinatory, or quasi-hallucinatory, voice 
which Eve White heard before the other Eve dis- 
closed herself to us was her deliberate work. 

From the two Eves during many interviews and 
from her husband and parents, we in time ob- 
tained a great deal of information about the pa- 
tient. Having concluded we had a reasonably 
complete and accurate history of her career since 
early childhood, we were astonished by the report 
of a distant relative who insisted that a few years 
before she met her present husband a previous 
marriage had occurred. 

Eve White denied this report and has never yet 
shown any knowledge of it. To our surprise Eve 
Black also maintained that we had been misin- 
formed, insisting that Eve White had married 
only once, that she herself had never and would 
never consider marrying any man. 

Finally, under the persistent pressure of evi- 
dence, Eve Black gave up her position, admitted 
that the relative’s report was correct, that she her- 
self and only she had been the bride. This event 
she told us occurred several years before Mrs. 
White’s marriage. While the other Eve was em- 
ployed in a town some distance from her parents’ 
home she had come “out” and gone to a dance 
with a man she scarcely knew. After a night of 
merriment, something was half-jokingly men- 
tioned about the pair getting married more or less 
for the hell of it. This apparently struck her 
fancy. 


She has recounted many details of outlandish
strife and hardship during several months when, 
apparently, she had lived with this man. No 
record of a legal union has been obtained but 
considerable evidence indicates she did cohabit 
during this period with such a man as she de- 
scribes, perhaps under the careless impression 
that a marriage had really occurred. She insists 
that some sort of “ceremony” was performed, say- 
ing that it was not formally recorded and admit- 
ting it may have been a ruse. During this time 
when she regarded herself as wed, Eve Black 
enjoyed her longest periods of uninterrupted sway. 
She was predominantly in control, almost con- 
stantly present. Apparently she had no desire for 
sexual relations but often enjoyed frustrating her 
supposed husband by denying herself to him. He 
in turn, she says, was prone to beat her savagely. 
She claims to have succeeded in avoiding most of 
the pain from this by “going in” and leaving the 
other Eve to feel the blows. 

This last claim immediately impressed us both 
as extremely implausible. If Eve White experi- 
enced the pain and humiliation of these beatings, 
why did she not remember them? She has con- 
sistently denied any memory of the entire marital 
or pseudomarital experience reported by Eve 
Black. Our unreliable but convincing informant 
maintains that she herself remained in control or 
possession nearly all the time during this adventure.
She furthermore insists that she can, by
exerting a considerable effort, often “pick out” 
or erase from Eve White’s reach certain items of 
memory. “I just start thinking about it very 
hard,” Eve Black says, “and after a while she quits 
and it doesn’t come back to her anymore.” All 
awareness of the beatings she claims so to have 
erased from the other’s recollection. Such a 
claim, obviously, was subject to testing by the
therapist. Several experiments indicated that it
is correct. 

After approximately eight months of psychi- 
atric treatment Eve White had apparently made 
encouraging progress. For a long time she had 
not been troubled by headaches or “blackouts.”
The imaginary voice had never been heard again 
since the other Eve revealed herself to the thera- 
pist. Mrs. White worked efficiently at her job 
and had made progress financially through salary 
raises and careful management. The prospect of 
returning to her husband and of working out a 
bearable relation was still blocked by serious obstacles,
but, having achieved more personal security
and financial independence, she had become
more hopeful of eventually reaching some accept- 
able solution. Though sadly missing the presence 
of her child, she found some comfort in her suc- 
cessful efforts to provide for her. She had made 
friends in the once strange city and with them, 
despite many worries and responsibilities, occa- 
sionally enjoyed simple recreations. 

Meanwhile Eve Black, though less actively re- 
sisted in emerging, had in general been causing 
less trouble. Being bored with all regular work, 
she seldom “came out” to make careless and 
costly errors, or indulge in complicating pranks 
while the breadwinner was on her job. Though in 
leisure hours she often got in bad company, 
picked up dates, and indulged in cheap and idle 
flirtations, her demure and conventional counter- 
part, lacking knowledge of these deeds, was spared 
the considerable humiliation and distress some of 
this conduct would otherwise have caused her. 

At this point the situation changed for the 
worse. Eve White’s headaches returned. They 
grew worse and more frequent. With them also 
returned the “blackouts.” Since the earlier head- 
aches had been related to, perhaps caused by, the 
other Eve’s efforts to gain control, and the “black- 
outs” had often represented this alternate’s peri- 
ods of activity, she was suspected and questioned. 
She denied any part or influence in the new de- 
velopment. She did not experience the headaches, 
but, surprisingly, seemed now to participate in the 
blackouts, and could give no account of what oc- 
curred during them. Apparently curious about 
these experiences, she said, “I don’t know where 
we go, but go we do.” 

Two or three times the patient was found lying 
unconscious on the floor by her roommate. This, 
so far as we could learn, had not occurred during 
the previous episodes reported by Eve White as 
“blackouts.” It became difficult for her to work 
effectively. Her hard-won gains in serenity and 
confidence disappeared. During interviews she 
became less accessible, while showing indications 
of increasing stress. The therapist began to fear 
that a psychosis was impending. Though this fear 
was not, of course, expressed to Eve White, it 
was mentioned to her reckless and invulnerable 
counterpart. The fact was emphasized that, 
should it be necessary to send Eve White to an 
institution, the other, too, would suffer the same 
restrictions and confinement. Perhaps, the therapist
hoped, this fact would curtail her in any
unadmitted mischief she might be working. 

Since it has for long been presumed that so- 
called dual personalities arise from a dissociation 
of an originally integrated entity of functioning 
and experience, efforts were naturally exerted from 
the first to promote reintegration. Attempts were 
made with each Eve to work back step by step 
into early childhood. With Mrs. White hypnosis 
was sometimes used to regain forgotten events or 
aspects or fragments of experience. It was hoped 
that some link or bridge might be found on which 
additional contact and coalition could grow or be 
built. Under hypnosis she occasionally re-experi- 
enced considerable emotion in recalling events of 
her childhood. We have never been able to hypnotize
Eve Black.

It soon became possible for the therapist to 
evoke either personality at will. During the first 
few weeks a transition from Eve White to Eve 
Black was more easily achieved by hypnosis. 
Shortly afterwards it became possible to simplify 
the procedure. Permission and the promise of 
cooperation were obtained from the lady present. 
Then the other was called by name and invited 
or encouraged to emerge. With repetition, and 
with deepening emotional relations between pa- 
tient and physician, this process became after a 
while very easily accomplished. In the very early 
stages of treatment an effort was made, perhaps 
a too naive effort, to promote some sort of blend- 
ing, or at least a liaison, by calling out both per- 
sonalities at once. To this attempt Eve White 
reacted with violent headache and emotional dis- 
stress so severe that it was not considered wise to 
continue. When the experiment was reversed, 
with the apparently invulnerable Eve Black mani- 
fest, much less agitation was observed. After one 
unsuccessful trial, however, she bluntly refused to 
go further. In explanation she said only that it
gave her “such a funny, queer, mixed-up feeling 
that I ain’t gonna put up with it no more.” 

Sometime after the return of headaches and 
blackouts, with Eve White’s maladjustment still 
growing worse generally, a very early recollection 
was being discussed with her. The incident fo- 
cused about a painful injury she had sustained 
when scalded by water from a wash pot. As she 
spoke her eyes shut sleepily. Her words soon 
ceased. Her head dropped back on the chair. 
After remaining in this sleep or trance perhaps 
two minutes her eyes opened. Blankly she stared
about the room, looking at the furniture and the
pictures as if trying to orient herself. Continuing 
their apparently bewildered survey, her eyes finally 
met those of the therapist, and stopped. Slowly, 
with an unknown husky voice and with immeas- 
urable poise, she spoke. “Who are you?” 

From the first moment it was vividly apparent 
that this was neither Eve White nor Eve Black. 
She did not need to tell us that. The thousands 
of points distinguishing the two Eves have grown 
more clear and convincing as we acquire addi- 
tional experience with each. So this new woman 
with time and study has shown herself ever more 
plainly another entity. Only in a superficial way
could she be described as a sort of compromise 
between the two. She apparently lacks Eve 
Black’s obvious faults and inadequacies. She also 
impresses us as far more mature, more vivid, more
boldly capable, and more interesting than Eve 
White. It is easy to sense in her a capacity for 
accomplishment and fulfillment far beyond that 
of the sweet and retiring Eve White, who, beside 
this genuinely impressive newcomer, appears col- 
orless and limited. In her are indications of initi- 
ative and powerful resources never shown by the 
other. This third personality calls herself Jane, 
for no particular reason she can give. In her it is 
not difficult to sense the potential or the promise 
of something far more of woman and of life than 
might be expected from the two Eves with faults 
and weaknesses eliminated and all assets com- 
bined. 

Some weeks after Jane emerged to make a 
group of three patients, electroencephalographic 
studies were conducted. 


REPORT OF ELECTROENCEPHALOGRAM 


This tracing consists of 33 minutes of continuous
recording including uninterrupted intervals
of 5 minutes or more of each personality as well
as several transpositions. The record was made
with a Grass Model III EEG machine (8 channels)
under conditions standard for this laboratory.

Each personality shows intervals of alpha 
rhythm interspersed with periods of diffuse low 
voltage fast activity. Intervals of L. V. F. are 
presumably associated with periods of mental 
tenseness, which the patient admitted experienc- 
ing. Although it is possible that these periods 
occurred at random, tenseness is most pronounced 
in Eve Black, next in Eve White and least of all 
in Jane. Several EEG’s would be needed to show 
this to be a constant relationship. 

When alpha rhythm occurs (relaxation), it is
steadily maintained at 10½ to 11½ cycles per sec.
by Eve White and by Jane. Eve Black’s alpha is
increased in rate to 12 or 13 cycles per sec.,
generally at 12½. This increase is significant and
falls at the upper border of normal limits ap- 
proaching an F1 category. It is interesting to 
note that F1 records are fairly common in psycho- 
pathic personality although no consistent corre- 
lation has yet been demonstrated. In addition to 
the increased rate there is evidence of restlessness 
and generalized muscle tension during Eve Black’s 
tracings which are not observed in the other two 
personalities. 

Transposition is effected within a few seconds. 
It is usually accompanied by artifact from eye 
movements and slight body movements. Alpha
rhythm is frequently blocked for several seconds 
during and following transposition. Alpha block- 
ing was most pronounced in passing from Eve 
White to Eve Black. It did not occur at all in 
transposition from Eve Black to Eve White. This 
might possibly suggest that transposition from 
Eve Black to Eve White is easier to effect. How- 
ever, only two such transpositions are recorded. 

No spikes, abnormal slow waves or amplitude 
asymmetries are recognized. 


SUMMARY 


All three personalities show alternate periods of 
alpha rhythm and low voltage fast activity, pre- 
sumably due to alternate periods of mental relaxa- 
tion and mental tenseness. The greatest amount 
of tenseness is shown by Eve Black, Eve White 
next and Jane least. Eve Black shows a basic 
alpha rate of 12½ cycles per sec., as compared
with 11 cycles per sec. for Eve White and Jane. 
This places Eve Black’s tracing on the border line 
between normal and slightly fast (F1). Slightly 
fast records are sometimes (but not consistently) 
associated with psychopathic personality. Eve 
Black’s record also shows evidence of restlessness 
and muscle tension. Eve Black’s EEG is defi- 
nitely distinguished from the other two and could 
be classified as border-line normal. Eve White’s 
EEG probably cannot be distinguished from 
Jane’s—both are clearly normal. 

J. Manter, M.D. 

EEG Laboratory 
Medical College of Ga. 
Jan. 5th, 1953. 


For several months now there have been three 
patients to interview and work with. Jane has 
awareness of what both Eves do and think but 
incomplete access to their stores of knowledge and 
their memories prior to her emergence upon the 
scene. Through her reports the therapist can de- 
termine when Eve Black has been lying. Jane 
feels herself personally free from Eve White’s 
responsibilities and attachments, and in no way 
identified with her in the role of wife and mother.
Apparently she is capable of compassion, and,
we feel likely, of devotion and valid love. She 
has cooperated with sincerity, and with judgment 
and originality beyond that of the others. Though 
it took her a while to learn what was quite new 
to her, she has already taken over many of Eve 
White’s tasks at work and at home in efforts to 
relieve and help her. Her feelings towards Eve’s 
little girl appear to be those of a wise and richly 
compassionate woman towards the child of a fam- 
ily not her own, but still a child in emotional 
privation. 

Her warm impulses to take a more active role 
with this little girl are complicated by the deep 
conviction that she must not in any way act so as
to come between the distressed mother and her 
only child. During the few months of her sepa- 
rate existence Jane has, one might say, become 
stronger and more active. Despite her fine intel- 
ligence she began without experience, or at least 
without full access to the experience of an adult. 
As time passes Jane stays “out” more and more. 
She emerges only through Eve White, never yet 
having found a way to displace Eve Black or to 
communicate through her. Almost any observer 
would, we think, find it obvious that Jane, and 
she only of the three, might solve the deepest 
problems that brought the patient we call Eve 
White to us for treatment. Could Jane remain 
in full possession of that integrated human func- 
tioning we call personality our patient would prob- 
ably, we believe, regain full health, eventually ad- 
just satisfactorily, perhaps at a distinctly superior 
level, and find her way to a happy life. 

Should this occur it seems very unlikely that 
Mr. White’s wife would ever return to him. On 
the other hand it is little more likely that Eve 
White, even if she becomes free of all that she 
has known as symptoms, could or would ever 
take up her role again as wife in that marriage. 
Should she try to do so, it is difficult to foresee 
much happiness for her or the husband. The 
probability of deep and painful conflict is appar- 
ent, also the real danger of psychosis. 

Were we impersonal arbiters in such a matter 
it would be easy to see, and to say, that the only 
practical or rational solution to this astonishing 
problem is for Jane to survive, and Jane only. 
A steadily prevailing Eve Black would indeed be 
a travesty of woman. The surface is indeed ap- 
pealing, but this insouciant and likable hoyden, 
though perhaps too shallow to become really 
vicious, would, if unrestrained, forever carry
disaster lightly in each hand.

The sense of duty, the willingness for self- 
sacrifice, so strong and so beautiful in Eve White, 
might bring her back repeatedly into this marital 
situation which she lacks the emotional vigor to 
deal with, and in which it is not likely she could 
survive. Jane, whose integrity, whose potential 
goodness, seems not less than that of Eve White, 
has rich promise of the power to survive, even to 
triumph against odds. 

It is perhaps unnecessary to point out that we 
have not judged ourselves as wise enough to make 
active decisions or exert personal influence in 
shaping what impends. It is plain that, even if 
we had this wisdom, the responsibility is not ours. 
Would any physician order euthanasia for the 
heedlessly merry and amoral but nevertheless 
unique Eve Black? If so, it is our belief, it could 
not be a physician who has directly known and 
talked for hours with her, not one who has felt 
the inimitable identity of her capricious being. 

A surviving Jane would provide for Eve White's 
half-lost little girl a maternal figure of superb 
resources. Perhaps in time she could give the 
child a love as real and deep as that of the mother 
herself. Perhaps. But would those feelings be 


4 A question of the psychotherapist’s responsibility
has been raised. Morton Prince has been accused by 
some, particularly by McDougall, of taking too active 
a part in “squeezing out” Sally. Our experience made 
us feel very keenly the wish not to exert pressures 
arbitrarily and perhaps play a part in the extinction 
of qualities possibly of real value if they were inte- 
grated into more responsible patterns of behavior. 
We believe there is some choice open to the psychia- 
trist as to which personality he will try to reinforce, 
but that he must be tentative and work along with 
developments within the patient (or patients?) rather 
than make full and final judgments. 

We feel that therapy has played a part in the 
emergence of Jane, but we do not consider her merely 
our creation. Our influence seems to have been more
catalytic than causal. Psychotherapy has not been 
directed according to an arbitrary plan. Although 
we have persistently investigated early experiences 
through all three manifestations of our patient, and 
have encouraged emotional reaction to them, we have 
sought to avoid insistence on any of the popular the- 
oretical forms of interpretation. 

Jane continues to grow in influence, to be out more 
and more. She has established contact with some 
events in the early life of Eve White, and seems more 
rooted in a past. We cannot predict with any great 
confidence the outcome, but we are hopeful that some 
reasonably good adjustment will work out through 
the capacities contributed by Jane. 
the actual and unique feelings that have sustained
the frail and tormented Eve White in her long, 
pathetic, and steadfast struggle to offer the child 
a chance for happiness? It may be said that this 
is foolish and tedious quibbling, that Jane after 
all, is the girl’s real mother. Was she not born 
of her body? All awareness of her as a daughter 
ever experienced by Eve White is recorded in the 
electrochemical patterns of Jane’s brain. True 
indeed. But is she her mother? Those who have 
known Eve White personally will find it hard to 
accept simple affirmation as the whole truth. 
What this whole truth is can be better sensed in 
direct feeling than conveyed by explanation. 

At a distance bridged only by printed or spoken 
words these “beings” may appear as factitious 
abstractions. In the flesh, though it is the flesh 
of a single body, one finds it more difficult so to 
dismiss them. Final decisions, or choices in the 
course of involuntary developments must, we have 
decided, be offered freely to something within our 
patient, perhaps to something beyond any levels 
of contact we have reached with Eve Black, with 
Eve White, or with Jane. 

Jane, who appears to have some not quite ar- 
ticulate understanding or purblind grasp of this 
whole matter, not available to either of the Eves, 
shares our sharp reluctance about participating in 
any act that might contribute to Eve White's ex- 
tinction. Unlike Eve Black, Jane has profound 
and compassionate realization of Eve White's re- 
lation to her child. The possibility, the danger, 
of a permanent loss of all touch with reality has 
occurred to Eve White. Through this we have 
found a better appreciation of her feelings as a 
mother. Too restrained ordinarily by modesty to 
speak about such a matter, after hypnosis she 
offered in quiet tones of immeasurable conviction 
to accept this extinction if it might win for her 
daughter Jane’s presence in the role she had not 
succeeded in filling adequately for her child. 

It has been said that a man must first lay down 
his life if he is to truly find it. Is it possible that 
this mother may, through her renunciation, some- 
how survive and find a way back to the one and 
dearest thing she is, for her child’s sake, ready to
leave forever? That we do not know. Long and 
intimate personal relations with this patient have 
brought us to wonder if in her we have blindly 
felt biologic forces and processes invisible to us, 
still uncomprehended and not quite imaginable. 

Recently Eve White, anything but a physically 
bold or instinctively active person, was challenged
suddenly by an event, for her momentous. Of 
this Jane, deeply moved, wrote to the therapist: 


Today she did something that made me know 
and appreciate her as I had not been able to 
do before. I wish I could tell her what I feel 
but I can’t reach her. She must not die yet. 
There’s so much I must know, and so very 
much I must learn from her. She is the sub- 
stance of, this above all to thine own self be 
true. In her, too, the quality of mercy is not 
strained. I want her to live—not me!

She saved the life of a little boy today. 
Everybody thought him to be her child, because 
she darted out in front of a car to pick him up 
and take him to safety. But instead of putting 
him down again, the moment his baby arms 
went around her neck, he became her baby— 
and she continued to walk down the street car- 
rying him in her arms. 

I have never been thus affected by anything 
in my four months of life. There seemed only 
one solution to prevent her possible arrest for
kidnapping. That was for me to come out and 
find the child’s mother. In the end I had to 
give him to a policeman. Later tonight when 
she had come back out, she was searching for 
her own baby. She had her baby again for a 
short while this afternoon; and I’m so happy 
for that. I still can’t feel Eve Black. I can’t 
believe she’s just given up. I feel inexpressibly
humble. 


DISCUSSION 


What is the meaning of the events we have 
observed and reported? Some, no doubt, will 
conclude that we have been thoroughly hood- 
winked by a skillful actress. It seems possible 
that such an actress after assiduous study and long 
training might indeed master three such roles and 
play them in a way that would defy detection. 
The roles might be so played for an hour, per- 
haps for a few hours. We do not think it likely 
that any person consciously dissimulating could 
over months avoid even one telltale error or im- 
perfection. Though this does not seem likely to 
us, we do not assume it to be impossible. Let us
remember, too, that in plays the actors are given 
their lines, and their roles are limited to repre- 
sentations of various characters only in circum- 
scribed and familiar episodes of the portrayed 
person’s life. The actor also has costume and 
makeup to help him maintain the illusion. 

Have we, others may ask, been taken in by 
what is no more than superficial hysterical
tomfoolery? We would not argue that the
psychopathology presented here has nothing in common
with ordinary hysterical conversions and dissocia- 
tions. We do believe that here there is also some- 
thing more, and something different. If one is to 
regard these three manifestations of personality 
as products of disintegration, could such a pre- 
sumed disintegration be schizophrenic, or perhaps 
incompletely schizoid? If the process is akin to 
the processes of schizophrenia, it must still be 
noted that none of the three products, not one 
of the three personalities, shows anything suggest- 
ing the presence of that disorder. Are we justi- 
fied in postulating a once unified whole from 
which our three performers were split off? Or is 
it possible that the functional elements composing 
each, as we encounter them at present, have never 
in the past been really or completely unified? 

The developmental integration of what we call 
personality appears to be a complex process of 
growth or evolution, a not-too-well comprehended 
unfolding of germinal potentialities. Let us com- 
pare such a process with the zygote’s course from 
microscopic unicellular entity to adult human 
being. Reviewing the biologic course of identical 
twins we come at length to cellular unity in the 
single zygote. Perhaps we must assume in the 
multiple personalities at least a primordial func- 
tional unity. If so, is it possible that some divi- 
sion might have begun far back in the stage of 
mere potentialities, at preconscious levels of 
growth not accessible to us except in surmise 
or theory? If so, what chance is there that an 
adequate integration may occur? 

One might from our verbal account easily see,
or read into, the character Jane some fusion of, 
or even a mere compromise between, the diverse 
tendencies of the two Eves. If she has, indeed, 
been formed of their substance it is difficult for 
us to assume that the process was merely additive. 
If all her elements derive from the other two, this
union, like that of hydrogen and oxygen to make 
water, seems to have resulted in a product gen- 
uinely different from both the ingredients from 
which it was formed. 

Have we in our many hours of enthusiastic 
work with this patient gradually lost ourselves, 
and our judgment, in an overdramatization of the 
subject? Are we reporting what is objective, or 
chiefly the verbal forms of our surmises and spec- 
ulations? It is not for us to give the final answer 
to these questions. We are aware that the only 
terms available to indicate what we think is valid 
carry also many connotations that we do not
assume or believe to be supported by fact (27).

Obviously the differing manifestations we have 
observed in one woman’s physical organism do 
not, in all senses of the term, indicate three quite 
separate people. Our words referring to the pos- 
sible disappearance or permanent extinction of 
one of the personality manifestations perhaps 
imply we regard this as an equivalent, or at least 
an approximation, of death. Are we guilty of a 
misleading exaggeration? No heart would stop 
beating should this occur. No eyes would
permanently close. No flesh would undergo
corruption. Such an extinction would not fulfill the
criteria by which death is defined. Yet, if we 
may ask, would his immediate replacement by an 
identical twin invalidate for a bereaved widow 
the death of her husband? This analogy is not 
precise. In some respects it is misleading. It 
does not give us an answer to the question we 
raise. Perhaps it may, nevertheless, accurately 
reflect some of our perplexity. 

For these and for many other questions that 
have confronted us in this study we have no full 
or certain answers. We ask ourselves what we 
mean by referring to that which we have observed 
by such a term as multiple personality. Immedi- 
ately we face the more fundamental question: 
What is the real referent of this familiar word 
personality? In ordinary use we all encounter 
dozens of unidentical referents, perhaps hundreds 
of overlapping concepts, all with vague and elusive 
areas extending indefinitely, vaguely fading out 
into limitless implications (28).

Any day we may hear that John Doe has be- 
come a new man since he quit liquor three years 
ago. Perhaps we tell ourselves that Harvard actu- 
ally made a different person of that boy across 
the street who used to aggravate all the neighbors 
with his mischievous depredations. Many re- 
ligious people describe the experience of being 
converted or born again in terms that to the 
skeptical often seem chiefly fantastic. 

With considerable truth, perhaps, it may be 
stated that after her marriage Mary Blank 
changed, that she has become another woman. 
So, too, when a man’s old friends say that since 
the war he hasn't been the same fellow they used 
to know, the statement, however inaccurate, may 
indicate something real. We hear that an ac- 
quaintance when drinking the other night was 
not himself. Another man, we are told, found 
himself after his father lost all that money. Every
now and then it is said that a certain woman’s 
absorption in her home and children has resulted 
in her losing her entire personality. Though such 
sayings are never taken literally, there is often 
good reason for them to be taken seriously. 

Are they not exaggerations or distortions used 
to indicate very imperfectly what is by no means 
totally untrue but what cannot be put precisely, 
or fully, into words? The real meaning of such 
familiar statements, however significant, helps us 
only a little in explaining what we think we have 
encountered in the case reported. Some relation 
seems likely, as one might say there is some rela- 
tion between ordinary vocal memory or fantasy 
and true auditory hallucinations. 

Though often distinguished from each of the 
other terms, “personality” is sometimes used more 
or less as a synonym or approximation for “mind,” 
“character,” “disposition,” “soul,” “spirit,” “self,” 
“ego,” “integrate of human functioning,” “iden- 
tity,” etc. In common speech it may be said that 
John has a good mind but no personality, or that
Jim has a wonderful personality but no character, 
etc. Often this protean word narrows (or broad- 
ens) in use to indicate chiefly the attractiveness, 
or unattractiveness, of some woman or man. In 
psychiatry its most specific function today is per- 
haps that of implying a unified total, of indicating 
more than “intelligence,” or “character,” more 
than any of the several terms referring with vari- 
ous degrees of exactness to various qualities, ac- 
tivities, responses, capacities, or aspects of the 
human being. In the dictionaries, among other 
definitions, one finds “individuality,” “quality or 
state of being a person,” “personal existence or 
identity.” 

There is, apparently, no distinct or whole or 
commonly understood referent for our word “per- 
sonality.” It is useful to us in psychiatry despite 
its elasticity, often because of its elasticity. If 
they are to be helpful all such elastic terms must 
be used tentatively. Otherwise they may lead us 
at once into violent and confused disagreement 
about what are likely to be imaginary questions, 
mere conflicts of arbitrary definition (14). Bear- 
ing this in mind we feel it proper to speak of Eve 
Black, Eve White, and of Jane as three “personali- 
ties.” Perhaps there is a better term available to 
indicate the manifestations of this patient. If so 
we are indeed prepared to welcome it, with
enthusiasm and with relief.

Our study has raised many questions. Even for
us it has settled few if any. The relatively slight 
or inconclusive differences between the personali- 
ties of our patient noted electroencephalographi- 
cally, and in psychometric and projective tests, 
are not particularly impressive beside the pro- 
found and consistent differences felt subjectively 
in personal and clinical relations. A well-quali- 
fied expert examined for us the handwriting per- 
formed by each Eve. Though considerably im- 
pressed by consistent and significant differences 
between the two productions, it is his opinion that 
those with adequate professional training could 
regularly establish sufficient evidence to show 
both were done by the same human hand. After 
a detailed investigation this conclusion was ex- 
pressed by our consultant: 


As a conclusion of the opinions derived from 
analysis of the various handwritings of this mul- 
tiple personality patient, it is believed that the 
handwriting does not undergo complete subordi- 
nation to each marked change of personality, even 
though each group exhibits evidence of emotional 
instabilities. It readily appears the handwriting 
of each personality is of a different person. Such 
apparent or discernible variations may lead the 
untrained observer to believe that the handwriting 
of each personality is completely foreign to the 
other. However, extensive investigation of these 
handwriting materials establishes beyond any 
doubt that they have been written by one and 
the same individual. Nothing was found to in- 
dicate a wilful and conscious intent to disguise 
writings executed within a personality or between 
the first and second personalities. 

Ward S. Atherton, Captain, 

Military Police Corps, U.S.A. 

Chief, Questioned Document Section 

Army Provost Marshal General’s 

Criminal Investigation Laboratory 

Camp Gordon, Georgia. 


Though unable at present to add anything sig- 
nificant to the hypotheses that were offered in the 
past by those who have worked with similar pa- 
tients, we find ourselves singularly stimulated by 
our direct experience with this case. If we have 
not so far devised final or even fresh answers we 
have at least been prompted to ask ourselves a 
number of questions. A few of these, even when 
put in verbal forms outwardly familiar, we find 
to our surprise have somehow become new to us 
and peculiarly stimulating. 

Though long acquainted in a general and in- 
direct way with Morton Prince’s celebrated stud- 
ies, we both deliberately refrained for months 
after beginning work with our case from reading
The Dissociation of a Personality (23) and
Clinical and Experimental Studies in Personality (24).
We hoped, in this way, to avoid projecting the 
conclusions and conceptions of another into what 
we encountered. 

After having noted what is recorded here, we 
compared our experience with what Prince ob- 
served and discussed in cogent detail approxi- 
mately fifty years ago. The popular terminology 
and theory of psychiatry today differ considerably 
from the explanations and hypotheses of behavior 
offered by the physician who wrote so impres- 
sively of Miss Beauchamp and of other matters. 

Most of us believe, no doubt, that psychiatry 
and psychology have advanced marvelously since 
the turn of the century. In many respects this 
belief is unchallengeable. In many respects, yes; 
but in all? 

In this half-century of progress have we not 
also developed some habits of thinking that may 
confuse us? Have we perhaps unwittingly en- 
shrined as sacred dogma many concepts that ob- 
scure or distort more than they reveal? Long 
sanctified verbal constructs, flabby theoretical ab- 
stractions are manipulated with a bold flourish in 
many of our treatises and monographs, presum- 
ably in the name of science. In tedious polysyl- 
labic jargon we read today of electrochemical 
libidos undergoing gelatinization (75), of par- 
ental imagos cannibalistically devoured per os and 
sadistically expelled per anum (7). Such terms
as “proved,” “so-and-so has established,” “clearly 
demonstrated,” etc. have become in our time more 
popular as synonyms for fantasy and speculation 
than Morton Prince found them (3, 7, 15, 21). 

How much can we congratulate ourselves on 
having advanced in the last fifty years if many 
of our leading authorities still find themselves 
bound to write in ponderous volumes of “actual 
neuroses” and solemnly contrast these revered 
artifacts with “psychoneuroses” (7)? Is it
progress if we establish the universality of castration
fear, and its supreme significance, by redefining 
“castration” to mean all parental and social forces 
that tend to restrict or direct genital activity (5)? 
By this method any point of doctrine regarded as 
too holy for questioning could indeed be proved 
valid. But, who will say that thereby we have 
revealed anything not already well known to a 
twelve-year-old moron? So, too, we can
immediately demonstrate that all women are to a
remarkable extent homosexual if we piously agree that
no impulse to activity, no courageous response, 
can be classified as other than purely masculine 
(10, 18). In recent issues of a reputable medical 
journal we read how an adult’s dream “proves” 
intrauterine emotional trauma, and demonstrates 
profound personal relations between embryo and 
placenta. The investigators warn the reader that 
“resistance” may cripple his ability to evaluate the 
plain evidence presented, may disqualify him from 
scientifically appraising these discoveries (8, 21, 
22). Is it not our responsibility as psychiatrists 
to examine frankly such developments as these 
and to ask ourselves what sort of progress we are 
making? 

Who can doubt that since the case of Miss 
Beauchamp was so carefully studied reliable 
knowledge in the field of psychiatry has
accumulated? Psychologic theory, “dynamic”
interpretation of personality disorder, has moved to points
far more ambitious than those reached by Morton 
Prince. One need not deny that much of this 
progress has been helpful, a genuine advance, to 
wonder if the movement has not also sometimes 
veered considerably from the direction of what is 
true or even plausible, and even occasionally spent 
much of itself in enthusiastic but circular expedi- 
tions about areas scarcely distinguishable from 
dianetics and other swamplands of veritable non- 
sense (72). 

Be this as it may. We suggest that further di- 
rect study of multiple personality and careful re- 
appraisal of Morton Prince’s generally neglected 
formulations may yet yield to workers in our field 
some promising clue still overlooked, a clue per- 
haps to possible discoveries that may eventually 
yield insight we need but lack today.


PSYCHOTHERAPY OF SCHIZOPHRENIA *


FRIEDA FROMM-REICHMANN * 


When I received the invitation to talk to you 
about the psychotherapy of schizophrenia, I gave 
a good deal of thought to the question of how you 
might like me to approach the topic. Finally, I 
felt it might be most appropriate to report the 
development in the understanding and the tech- 
nique of our clinical work since 1948, when I 
had the privilege of talking to you about it at the 
schizophrenia symposium during the annual meet- 
ing in Washington. 

The goal of psychotherapy with schizophrenics 
was seen then, as it is now, as helping them, by a 
consistent dynamically oriented psychotherapeutic 
exchange, to gain awareness of the unconscious 
motivations for and curative insight into the 
genetics and dynamics of their disorder. 

As a result of the continued research which is 
inherent in dynamic psychotherapy, I have gained 
some further insight into the dynamics of schizo- 
phrenic symptomatology from which have evolved 
some variations in the details of the treatment. 
Briefly, they are as follows: 

1. The old hypothesis according to which the 
schizophrenic’s early experiences of warp and re- 
jection were of over-all significance for the inter- 
pretive understanding and treatment has been 
somewhat revised. 

2. The conflict-provoking dependent needs of 
schizophrenic patients have been seen more 
clearly. 

3. The devastating influence of schizophrenic 
hostility on the patients themselves has been un- 
derstood more clearly in connection with their 
states of autism and partial regression (weak ego 
—autistic self-depreciation). 

4. This has led to a therapeutically helpful re- 
formulation of the anxiety of schizophrenic pa- 
tients as an outcome of the universal human con- 
flict between dependency and hostility which is 
overwhelmingly magnified in schizophrenia. 


* Reprinted by permission from the American 
Journal of Psychiatry, December, 1954, Vol. 111, No.
6, 410-419. 

1 The Academic Lecture read at the hundred and
tenth annual meeting of the American Psychiatric 
Association, St. Louis, Missouri, May 3-7, 1954.

5. The multiple meaning of some schizophrenic
communications and its influence on the psychia- 
trist’s interpretive endeavors have been clarified. 

Before I begin to elaborate these topics, I have 
to ask you to forgive me for lack of reference to 
publications of other workers in the field. There 
is unfortunately not time enough to comment on 
the published work of our colleagues, to indicate 
what I owe to them, and also to develop my own 
conceptions. So I felt that I ought to decide to do 
the latter. 

I would like to begin by stating that my dis- 
cussion will comprise the treatment of hospitalized 
disturbed psychotics as well as that of manifestly 
less disturbed ambulatory patients whom we treat 
in the same way through all phases and all mani- 
festations of their illness. This position is not 
new, but it has recently become more controver- 
sial because of opposite techniques which other 
authors have propagated. 

From a social and behavioral standpoint and 
from the viewpoint of the special care which 
manifestly psychotic patients may need in order 
to be protected from harming themselves and 
others, the difference between these two types of 
patients may seem tremendous. Psychodynami- 
cally speaking, I see no difference between the 
symptomatology of actively psychotic and more 
conformative schizophrenics. 

All schizophrenic patients live in a state of par- 
tial regression to early phases of their personal 
development, the disturbed ones more severely re- 
gressed than the conformative ones. All are also 
living simultaneously on the level of their present 
chronological age, the conformative ones more 
obviously so than the severely disturbed ones. Ir- 
respective of the degree of regression and dis- 
turbance, we try to reach the regressed portion 
of their personalities by addressing the adult por- 
tion, rudimentary as this may appear in some 
severely disturbed patients. Also, the general 
psychodynamic conception that anxiety plays a 
central role in all mental illnesses and that men- 
tal symptoms in general may be understood simul- 
taneously as an expression of and as a defense 
against anxiety and its underlying conflicts holds, 
regardless of the severity of the picture of illness 
and regardless of its more or less dramatic char- 
acter. Hence we make the exploration of the 
dynamic roots of the schizophrenic’s anxieties our 
potential goal through all phases of illness. 

Lack of immediate communicative responses 
to treatment in acutely disturbed patients is no
measuring rod for their actual awareness of and 
for their inner response to our psychotherapeutic 
approach. This old experience has been further 
corroborated in more recent dealings with several 
recovered patients. They did refer to various as- 
pects of our psychotherapeutic contacts, after their 
emergence, while we were working through the 
dynamics of their problems, or later, while we
were reviewing treatment and illness during the 
recovery period. 

While symptomatic psychotherapy of acute psy- 
chotic manifestations may be necessary with some 
patients for situational reasons, many of us con- 
sider it not too important to be overconcerned 
with the duration of the acutely disturbed states 
of patients while they are under psychotherapy. 

My experience during the last twenty years has 
been mainly with schizophrenic patients who came 
to our hospital in a state of severe psychotic 
disturbance, from which the majority emerged 
sooner or later under intensive dynamic psycho- 
therapy. After their emergence, they continued 
treatment with the same psychiatrist through the 
years of their outwardly more quiet state of ill- 
ness, with the aim of ultimate recovery with in- 
sight. During both phases the patients were seen 
for four to six regularly scheduled interviews per 
week, lasting one hour or longer. Sometimes 
relapses occurred. Such relapses were due to 
failure in therapeutic skill and evaluation of the 
extent of the patient’s endurance for psychother- 
apy, to unrecognized difficulties in the doctor- 
patient relationship, or to responses to intercur- 
rent events beyond the psychiatrist’s control. As 
a rule, these relapses could be handled successfully 
if the psychiatrist himself did not become too 
frightened, too discouraged, or too narcissistically 
hurt by their occurrence. 

From the experience with these patients we 
learned about one more reason for advocating the 
same type of psychotherapeutic approach through 
all phases of the illness: part of the work which 
a patient has to accomplish during treatment and 
at the time of his recovery is, in my judgment, to 
learn to accept and to integrate the fact that he 
has gone through a psychotic illness and that there 
is a “continuity,” as one patient called it, between 
the person as he manifested himself in the psy- 
chosis and the one he is after his recovery. The 
discussion of the history of patients’ illness and 
treatment after their recovery serves, of course, 
the same purpose. This is in contrast to the
therapeutic attitude of some psychiatrists who hold 
that recovering patients should learn to detest and 
eject their psychotic symptomatology, like a for- 
eign body, from their memory. 

The difficult task of integrating the psychotic 
past, which we advocate, will be greatly facilitated 
if it can be done on the basis of the patient’s 
confidence in a psychiatrist who has maintained 
the same type of psychotherapeutic relationship 
with him throughout the course of treatment. 
Changes in the doctor’s therapeutic approach may 
easily become a mirror of the lack of continuity 
in the patient’s personality and, incidentally, may 
become an inducement for the patient to dwell in 
one or another phase of his illness, depending 
upon his preference for this or another type of 
therapeutic relationship. 

The following experience with a patient illus- 
trates the difficulties of integrating the experi- 
ence of a past psychosis: 

This patient emerged from a severe schizo- 
phrenic disturbance of many years’ duration, for 
which she was finally hospitalized for two years 
at Chestnut Lodge and then treated as an ambu- 
latory patient for another two years. Eventually 
she became free of her psychotic symptomatology 
except for the maintenance of one manifest symp- 
tom: she would hold on to the habit of pulling 
the skin off her heels to the point of habitually 
producing open wounds. No attempt at under- 
standing the dynamics of this residual symptom 
clicked, until the patient developed one day an 
acute anxiety state in one of our psychotherapeutic 
interviews in response to my commenting on fa- 
vorable “changes” that had taken place in her. 
After that, the main dynamic significance of the 
skin-pulling became suddenly clear to her and 
to me. “I am still surprised and sometimes a lit- 
tle anxious about the change which I have under- 
gone,” she said, “and about finding and maintain- 
ing the continuity and the identity between the 
girl who used to be so frightfully mixed up that 
she had to stay locked up on the disturbed ward 
of Chestnut Lodge, and the popular and academically
successful college girl of today.” The skin-pulling,
as a symptom similar to another self-mutilating
act, that of burning herself, which she repeatedly
committed while acutely ill, helped her
to maintain her continuity. It made it possible to
be ill and well at the same time, because it was 
only she who knew about the symptom which
could be hidden from everybody else with whom
she came in contact as a healthy person. After 
this discovery, the symptom eventually disap- 
peared. 

Incidentally, important as the understanding of 
this one dynamic aspect of the patient’s symptom 
was for therapeutic reasons, this does not mean 
that it constituted its only significance. 

It was stated that mental symptoms in general 
can be understood as a means of expressing and 
of warding off anxiety and the central conflicts 
which are at the root of this anxiety and that 
the exploration of this anxiety is most important 
in psychotherapy with schizophrenics. If this is 
true, we have to ask for a specific psychodynamic 
formulation of the causal interrelatedness between 
schizophrenic symptomatology and the conflicts 
underlying the anxiety in schizophrenic patients. 
A correct workable conception of the psychody- 
namic correlation between anxiety and schizo- 
phrenic symptom formation is a prerequisite for 
the development of a valid method of dynamic 
psychotherapy with schizophrenic patients. 

We know the historically determined deadly 
fear of schizophrenics of being neglected, rejected, 
or abandoned and their inability to ask for the 
acceptance and attention they want. Conse- 
quently, most psychiatrists who did psychotherapy 
with schizophrenics in the early days suggested 
treating them with utter caution, as I did, or with 
unending maternal love, permissiveness, and un- 
derstanding, as did Schwing and, more recently, 
Sechehaye. While doing so, psychiatrists faced 
another dynamically significant problem of the 
schizophrenic—the unconscious struggle between 
his intense dependent needs and his recoil from 
them. These we learned to understand geneti- 
cally as the correlate to the patients’ experience 
of neglect by the “bad mother” at a time when 
her attention was indispensable for the infant’s 
and the child’s survival. 

We also know about the resentment, anger, 
hostility, fury, or violence, with which the infant 
and child—the “bad me,” as Sullivan called it— 
who later becomes the schizophrenic patient re- 
sponds to the early damaging influences of the 
“bad mother,” as he experienced her. 

In order to understand the devastating signifi- 
cance of this hostility for schizophrenic patients, 
we have to realize the following developmental 
facts of their lives. As we first learned from 
Freud and Bleuler, schizophrenics are people who
have responded to their early misery in interpersonal
contacts not only with anger and hostility
but also with a partial regression into an early 
state of ego development and of autistic self- 
concern and self-preoccupation. This early trau- 
matization and the partial regression make for a 
weak organization of the schizophrenic’s ego. 
Consequently, he feels more threatened than other 
people by all strong emotional experiences and, 
above all, by the realization of his own hostile 
impulses. 

Another reason for the specific hardship which 
schizophrenic hostility creates for the patients is 
that their autistic self-preoccupation makes them 
painfully concerned with their own “bad me,” 
with their own hostility and fury, or their fanta- 
sies of violence and destruction against themselves 
and others. 

Besides, their grandiose concept of power in 
these states of regression to an early state of in- 
terpersonal development makes for their preoccu- 
pation with themselves as more or less dangerous 
people. 

Where other types of patients are mainly con- 
cerned with the fear of disapproval, of the with- 
drawal of love which they may elicit in other peo- 
ple by their hostile impulses or other emanations 
of their “bad me,” schizophrenic patients are more 
concerned with their own status as dangerously 
hostile people, with the damage which may be 
done to others who associate with them, and with 
their impulses of punitive self-mutilation. 

Yet neither the fearful and grandiose preoccu- 
pation with his dangerous hostility nor the threat 
of the primary abandonment by mother nor the 
resulting dependent needs from which the pa- 
tient simultaneously recoils nor the secondary re- 
jection he may have elicited in the mother and 
other significant persons in his environment be- 
cause of his “badness” is in itself potent enough 
to elicit schizophrenic anxiety. 

Schizophrenics suffer, as all people in our culture
do (though others to a much lesser degree),
from the tension between dependent needs and 
longing for freedom, between tendencies of cling- 
ing dependence and those of hostility. For the 
above-mentioned reasons, the degree of the schizo- 
phrenic’s need for dependency, the extent to which 
he simultaneously recoils from it, and the color 
and degree of his hostile tendencies and fantasies 
toward himself and others are much more intense 
than in other people. As a result, the general tension
engendered by the clash of these powerful
emotional elements becomes completely over- 
whelming. In other words, the quantitative dif- 
ference between the schizophrenic’s anxiety and 
similarly motivated tensions in people who have 
not been emotionally traumatized as early in life 
as the schizophrenic and who could therefore de- 
velop a stronger ego organization is so great that 
it acquires a totally different quality. It is this 
tremendous volume of the schizophrenic’s anxiety 
that makes it unbearable in the long run. It then 
has to be discharged by symptom formation; i.e., 
schizophrenic symptomatology is seen as the ex- 
pression of and defense against schizophrenic anx- 
iety, engendered by the tremendous tension be- 
tween his great dependent needs, his fear of giv- 
ing them up, his recoil from them, his hostility, 
and his fantasies of destructiveness against him- 
self and others. 

In delineating the dynamic interrelatedness be- 
tween schizophrenic anxiety and symptomatology, 
I do not claim, of course, to solve the total prob- 
lem of schizophrenic symptomatology. I am re- 
ferring only to such portions of the dynamics as 
seem necessary for the clarification of my thera- 
peutic conceptions. Our treatment of many 
schizophrenic manifestations has been corrected 
or markedly improved in the light of the hypoth- 
esis offered. 

Take, for example, the meaning of the schizophrenic’s
“fear of closeness,” a formulation which,
incidentally, has been much abused. In the early 
years of psychotherapy with schizophrenics we 
used to understand this fear of intimacy as an ex- 
pression of anxiety that all closeness, much as it 
was simultaneously desired, might be followed by 
subsequent rejection; then we learned that this 
fear of closeness seemed also strongly determined 
by the fear which the partially regressed schizo- 
phrenic, with his weak ego organization, felt, that 
closeness might endanger his identity, might de- 
stroy the boundaries between his own ego and 
that of the other person. 

In the meantime, I learned from my work with 
quite a number of further patients that their fear 
of closeness is tied up with their anxiety regarding 
the discovery of their secret hostility or violence 
against persons for whom they also feel attachment
and dependence. They give a mitigated,
non-dangerous expression to this hostility and try 
simultaneously to hide it as a secret by staying 
away from people.

Let me mention, in this context, an experience
which I had repeatedly with patients whom I saw 
in an office connected with my home: they be- 
came tense and anxious when we met after my 
secretary and maid had left the house. The patients
commented on the lack of protection against
their hostile impulses. 

One young paranoid patient formulated this 
outrightly, by asking, “Do you realize that I can 
knock you down in no time?” Unfortunately, I 
became preoccupied with my role of demonstrat- 
ing the lack of fear which at the time was luckily 
mine. Thus I failed to notice how frightened 
the patient felt by the realization of his potential 
violence against a woman-doctor, with whom he 
had established at the same time a dependent re- 
lationship. Later on I realized that he was warn- 
ing me against and asking for protection from 
future acts of violence, by which he felt we were 
both threatened. Subsequently, such threats 
against me or other doctors whom he acciden- 
tally saw in my house, against the house itself, 
and against the attendants who came to take care 
of him were the unfortunate result. All these 
assaultive acts were accompanied by marked signs 
of anxiety. 

I continued seeing the patient in a wet pack, 
until he agreed to abstain from all violent actions 
and to express his hostile feelings verbally. This 
he did for some time, alternately with verbal ex- 
pressions of his dependent attachment and with 
non-verbal signs of anxiety, until he developed 
a marked manifest psychotic symptomatology. 
After that, it became more difficult to have the 
patient face his dependent needs and his hos- 
tility or the anxiety engendered by both. Had I 
caught on immediately to the patient’s anxiety re- 
garding his own hostility, he might have been 
spared the necessity of transforming it into overt 
psychotic symptomatology. 

Let us now take a look at states of catatonic 
stupor in the light of our hypothesis. I believe it 
is of interest to state that many clinicians have 
been accustomed to describe stuporous states as 
a result of the schizophrenic’s withdrawal of in- 
terest from outward reality. Hence the over- 
simplification of interpreting them only as a re- 
sponse to catatonic fear of rejection becomes 
quite understandable. 

Actually, a patient in stupor has not withdrawn 
his interest from the environment. As we know 
from reports about the experiences while in stupor,
which these patients furnish after their emergence,
they are, more frequently than not, keen
observers of what is going on in their environ- 
ment. Withdrawal of the ability for interpersonal 
communication is what characterizes the condi- 
tion of the patient in stupor, not withdrawal of 
interest in the environment per se. As we know 
now, this comes about not only in response to the 
threat of rejection by others but much more for 
fear of the patient’s own hostility or violence in 
response to actual or assumed acts of rejection 
from other people. 

I remember in this connection the catatonic 
patient previously reported who became stuporous 
when she did not receive my message that I had 
to postpone a scheduled interview. Upon discov- 
ering this unfortunate omission, I painstakingly 
explained the situation to the patient. When she 
heard and understood me, she emerged from the 
stuporous state, and psychotherapeutic contact 
could be resumed. 

Incidentally, while telling you about my thera- 
peutic approach to this or other patients, I have 
to fight off a temptation to dramatize—this, in 
spite of the fact that dramatization certainly is 
not in accord with what I would consider good 
taste in delivering a scientific paper. Upon asking 
myself about the reason for this temptation, I dis- 
covered that actually it is not so illegitimate as it 
appears to be. It is promoted by the fact that I 
feel inclined to duplicate tone and inflections of 
the patient's and my voices, the concomitant ges- 
tures, changes in facial expression, etc. This 
comes about because the doctor’s non-verbal con- 
comitants of the psychotherapeutic exchange with 
schizophrenic patients, in and outside manifestly 
psychotic episodes, are equally, if not at times 
more, important than the verbal contents of our 
therapeutic communication. 

The particular emotional stimulus to which a 
stuporous schizophrenic will respond, which in- 
stigated this digression, must be much stronger 
than one that can be produced by the content per 
se of what is said. An academic type of delivery 
to the patient will not do the trick. 

Of course, to a certain extent non-verbal ele- 
ments play a great role in all interpersonal com- 
munications, but the degree of expressive skill 
with which the patient himself uses means of non- 
verbal communication and his specific sensitivity 
to the meaning of its use by the psychotherapist 
are such that, for all practical purposes, the difference
in quantity, here again, turns actually into
one of quality. 

This great perceptive sensitivity of schizophrenic 
patients was one of the reasons for my overcau- 
tious approach to them in bygone times. We 
used to look at the sensitiveness of these patients 
in a merely descriptive way and labeled it as one 
of their admirable characteristics. If we investi- 
gate it psychodynamically, we realize that it de- 
velops actually in response to their anxiety as a 
means of orientation in a dangerous world, and 
we can use it as a signpost on our road toward 
the psychodynamic investigation of schizophrenic 
anxiety. Also we should not overlook the possi- 
bility that many of the initially correct results of 
the schizophrenic’s perceptive sensitivity may be 
subsequently subject to distorted psychotic inter- 
pretation and misevaluation. 

To return to our discussion of the psychody- 
namics of states of catatonic stupor, I too used to 
interpret them as a sign only of the patients’ hav- 
ing withdrawn because of the lack of considera- 
tion or rejection of them. I believe now that this 
is neither the primary nor the only cause and that 
withdrawal into stupor is more strongly motivated 
by the anxiety of patients who realize the danger 
of their own hostile responses to such neglect by 
people on whom they depend and to whom they 
are attached. Several patients corroborated the 
validity of this hypothesis by spontaneous com- 
ments after their recovery. 

The symptoms that patients in stupor show con- 
comitant with their withdrawal of interest from 
communication furnish another proof. Stuporous 
patients regress to a period of life when they used 
food intake and elimination as an expression of 
their hostility against and of their wish to exert 
control over their environment. 

The hostile meaning of disturbances in elim- 
ination can also be demonstrated outside stupor- 
ous states. I had impressive proof of it in my 
dealings with a schizophrenic woman patient, who 
is also mentioned in the Stanton and Schwartz 
paper, “A Social Psychological Study of Incon- 
tinence.” 

One day, this patient urinated, before I came 
to see her, on the seat of the chair on which I 
was supposed to be seated during our interview.
I did not see that the chair was wet. The patient 
did not warn me, and I sat down. I became aware 
of the situation only after the dampness had penetrated
my clothing. I thereupon expressed my
disgust in no uncertain terms. Then I stated that
I had to go home. The patient asked anxiously 
about my coming back, which I refused with the 
explanation that the time allotted to our inter- 
view would be over by the time I had taken a 
bath and attended to my soiled clothes. 

Obviously, the patient’s wetting my chair was 
an expression of hostile aspects in her dependent 
relationship with me. However, I did not say so 
in so many words, because I felt that the verbali- 
zation of this insight should come from the pa- 
tient. In subsequent discussions of the event, she 
responded first with symptom formation and non- 
verbal communication, wavering back and forth 
from expressions of hostility against me to expres- 
sions of attachment and dependence, until she was 
finally able to reveal that this had been a planned 
expression of resentment against me. The patient 
wished to punish me for what she had experienced 
as excessive therapeutic pressure during an inter- 
view preceding the chair-wetting. 

Certain symptoms of several hebephrenic pa- 
tients of our observation could also be psycho- 
dynamically understood and therapeutically ap- 
proached as an expression of the anxiety con- 
nected with their hostility toward people on whom 
they likewise felt extremely dependent. These 
patients withdrew their interest from their inter- 
personal environment except for a kind of tolerant 
and peaceful, if incomprehensible, give-and-take 
with some of their fellow patients, until it was all 
suddenly interrupted by an outburst of hostility 
against these patients or against the personnel. As 
far as their dealings with me went, they did what 
hebephrenic patients will do at times, as we all 
know: a kind of mischievous smile or laughter 
accompanied or interrupted their scarce commu- 
nications or was in itself the only sign of their 
being in some kind of contact with me. Two 
patients stated, after they were ready to resume 
verbal contacts with me, that their laughter was 
a correlate of hostile derogatory ideas against and 
fantasies about me. As they at last established a 
close relationship of utter dependence upon me, 
this was accompanied by a marked increase in 
intensity and duration of these spells of deroga- 
tory, tense laughter. The anxiety connected with 
the establishment of a dependent relationship ex- 
pressed itself and was warded off by the increased 
derogatory laughter. The laughter subsided eventually,
in response to the psychotherapeutic investigation
and the working through of the various
aspects of the patients’ relationship with me.

With regard to paranoid patients, one of their 
dynamisms is, as we know, that they project onto 
others the blame for what they consider blame- 
worthy in themselves. Upon investigation of the 
contents of their blameworthy experiences, we 
always discover that they are extremely hostile 
in nature. The suspiciousness of these people
points in the same direction. 

Again, their suspicion and hostility increase 
parallel with the realization of their friendly de- 
pendent relationship with the psychiatrist. This 
showed quite impressively in the above-mentioned 
violent man-patient. The fact that the office 
where we initially met was part of my home be- 
came to him, to use Mme. Sechehaye’s expression, 
a “symbolic realization” of his wish to be my 
friend and houseguest. As he fantasied that I 
shared his wishes and hallucinated that he heard 
me say so, he became more and more hostile and 
anxious. 

If our hypothesis about the interrelatedness be- 
tween craving for and recoiling from dependency, 
dangerous hostility and violence against them- 
selves and others, overwhelming anxiety, and 
schizophrenic symptomatology is correct, we must 
ask how the therapeutic approaches of consistent 
love and permissive care, as they used to be given 
to schizophrenic patients by some therapists, in- 
cluding myself, could be helpful. We used to 
think that they were successful (1) because they 
gave a patient the love and interest he had missed 
since childhood and throughout life; (2) because 
his hostility could subside in the absence of the 
warp which had originated it; and (3) because 
the patient was helped to re-evaluate his distorted 
patterns of interpersonal attitudes toward the real- 
ity of other people. 

We now realize that what we have long known 
to be true for neurotic patients also holds true for 
schizophrenics. The suffering from lack of love 
in early life cannot be made up for by giving the 
adult what the infant has missed. It will not have 
the same validity now that it would have had 
earlier in life. Patients have to learn to integrate
the early loss and to understand their own part 
in their interpersonal difficulties with the signifi- 
cant people of their childhood. 

I also know now, and can corroborate this with
spontaneous statements of recovered patients, that
the love and consideration given to them is therapeutically
helpful because they take
it as proof that they are not so bad, so hostile, in
the eyes of the therapist, as they feel themselves 
to be. 

The few fragments of therapeutic exchange with 
patients quoted so far may serve as examples of 
the change in our psychotherapeutic attitude, part 
of which I have already elaborated in my con- 
tribution to the 1950 Yale Symposium on Psycho- 
therapy with Schizophrenics. 

Of course, we give our schizophrenic patients 
all the signs of empathic consideration that they 
need because they suffer. If possible, we prefer 
to do so by implication or in non-verbalized in- 
nuendoes. Too marked sympathetic statements 
may enhance fear of intimacy and may unneces- 
sarily increase patients’ dependence on the thera- 
pist, putting into motion the psychopathological 
chain of dependent attachment, resentment, anx- 
iety, symptom formation. 

However, we no longer treat the patients with 
the utter caution of bygone days. They are sensi- 
tive but not frail. If we approach them too cau- 
tiously or if we do not expect them to be poten- 
tially able to discriminate between right and 
wrong, we do not render them a therapeutically 
valid service. We contribute to their low self-evaluation,
instead of helping them to develop
a healthier attitude toward themselves and others. 

Also, if there was lack of parental interest in 
infancy, this entails lack of guidance in child- 
hood. This fact deserves more therapeutic con- 
sideration than it has been given so far. There 
are therapeutically valid variations of the guid- 
ance needed and missed in early childhood, which 
can be usefully included in psychotherapy with 
schizophrenics in adulthood. 

One exuberant young patient, the daughter 
of indiscriminately “encouraging” parents, was 
warned against expecting life to become a garden 
of roses after her recovery. Treatment, she was 
told, should make her capable of handling the 
vicissitudes of life which were bound to occur, 
as well as to enjoy the gardens of roses which 
life would offer her at other times. When we 
reviewed her treatment history after her recov- 
ery, she volunteered that this statement had helped 
her a great deal, “not because I believed for a 
moment that you were right, Doctor, but because 
it was such a great sign of your confidence in me
and your respect for me, that you thought you
could say such a serious thing to me and that I 
would be able to take it.” 

In line with our attempts at raising patients’ 
low opinions of themselves, we replace offers of 
interpretation by the therapist, if possible, by at- 
tempts at encouraging patients to find and formu- 
late their interpretations themselves, as demon- 
strated in my exchange with the patient who wet 
the chair. 

So far we have discussed the psychodynamics 
of schizophrenics’ symptom formation in general 
as a response to their anxiety. Let us now con- 
sider the double and multiple meaning that is 
inherent in many of the schizophrenic’s cryptic 
and distorted manifestations. Many of them 
elude the psychiatrist's understanding, but they 
may yield indirectly to therapeutic endeavors in 
other areas. Insight into their dynamics may thus 
be gained in subsequent discussions. 

Others, such as hallucinations and delusions, I 
found frequently accessible to a direct psycho- 
therapeutic approach. They would be success- 
fully examined with the patient as they occurred 
in his experience and in terms of his own formu- 
lations. I stated, however, explicitly to the patient 
that I did not share his hallucinatory or delu- 
sional experience. 

There is one more access to understanding 
schizophrenic communications which has not yet 
been mentioned. Schizophrenics are able to re- 
fer in their productions simultaneously to experi- 
ences from the area of their early childhood, from 
their present living in general, and, if they are 
under treatment, from their relationship with the 
therapist, as dreamers do in their dreams. Some- 
times we are able to understand the meaning of 
and their reference to various chronological levels 
of experience, sometimes not. 

At any rate, it is most important for the psy- 
chiatrist to realize this multiple meaning of many 
schizophrenic symptoms and communications. 
This realization should make us replace the old 
therapeutic attitude that therapists ought to be 
able to find and offer to the patient the only cor- 
rect meaning of a symptom or communication by 
the suggestion that they should train themselves 
to become able to feel which of several meanings 
of a schizophrenic symptom or communication (if 
they catch on to several of them) is the therapeutically
most significant one at a given time. This
ability of the psychiatrist to select sensitively when
and what to present to the patient is most de- 
sirable, because of the narrowed ways of the 
schizophrenic’s thinking and his short span of 
attention, which limits his capacity to listen. 

The insights into the possibilities and the limi- 
tations of understanding schizophrenic communi- 
cations should do away with the endless discus- 
sion that used to go on between various members 
of groups of psychotherapists as to whether a pa- 
tient’s communication in word or action meant 
only what Dr. A heard or exclusively what Dr. B 
heard. Depending upon the scope of personal 
and clinical experience and the personality of the 
therapist and on his ability to understand patients’ 
communications via identification, each among 
several psychotherapists may catch on to one of 
the different meanings of a patient’s communica- 
tion. 

The insight into the manifold meanings of pa- 
tients’ symptoms or other manifestations may also 
do away with the continuing discussions in our 
literature of the question whether or not schizo- 
phrenic patients understand their own communi- 
cations. I believe it should be stated that they 
sometimes do and sometimes do not. Sometimes
they may, above all, be aware of the descriptive 
content of their communication but not of its 
dynamic significance. While this whole question 
holds great theoretical interest, I believe now that 
its solution is not too important for therapeutic 
purposes. This holds true all the more, since the 
main trends in treatment no longer go in terms of 
translating the descriptive meaning of the content 
of any single symptom. 

There are two facts that have led us more and 
more away from working with patients in terms 
of interpreting their various symptoms and other 
cryptic communications. One is negative and is 
determined by the fact that most isolated interpre- 
tations of the content of a single symptom or 
other communication will not cover all its mean- 
ings in a therapeutically significant way. The 
other is an important positive one: it follows from 
the knowledge of the psychodynamic fact that 
schizophrenic patients, like any other mental pa- 
tients under treatment, repeat with the therapist 
the interpersonal experiences which they have un- 
dergone during a lifetime. 

Hence we have moved increasingly in the direction
which I have already elaborated in previous
papers: we make the therapeutic exploration
and clarification of schizophrenic anxiety and 
symptomatology, as they manifest themselves in 
the patient-doctor relationship, as integral a part 
of psychotherapy with schizophrenics as it is with 
neurotic patients. Some modifications are, of 
course, required in view of the difference between 
schizophrenic and neurotic modes of relatedness 
with the psychiatrist and with other people. But 
in both cases our therapeutic attention is focused 
on the dynamic investigation and clarification of 
the conscious and the unconscious aspects of the 
patient-doctor relationship in its own right and in 
its transference aspects. Special attention is paid 
to the exploration of the anxiety aroused by the 
therapist's probing into the patients’ problems and 
to their security operations against it. 

Here is an example from the treatment history 
of the patient who pulled the skin off her heels, 
which illustrates both the multiple meaning of 
schizophrenic symptoms on various experiential 
levels and our approach to its basic dynamic sig- 
nificance in terms of investigating its manifesta- 
tions in the patient-doctor relationship: 

We are already familiar with the dynamic valid- 
ity of the skin-pulling as a way for the patient to 
establish her “continuity.” As we learned in the 
course of its further investigation, the localization 
of this symptom was determined by mischievously 
ridiculing memories of her mother’s coming home 
from outings to prepare a meal for the family, 
going into the kitchen, removing shoes and stock- 
ings but not coat and hat, and walking around the 
kitchen on bare feet. 

The self-mutilating character of the symptom 
proved to be elicited by the patient’s resentment 
against me. In her judgment, I misevaluated the 
other act of self-mutilation from which she suf- 
fered during her psychotic episodes, the compul- 
sion to burn her skin. The patient thought of it 
as a means of relieving unbearable tension, where- 
as she felt that I thought of it only as a serious 
expression of tension. In maintaining the skin- 
pulling, while otherwise nearly recovered, she 
meant to demonstrate to me that skin injuring 
was not a severe sign of illness. 

During the treatment period after the dismissal 
from the hospital, the patient tried for quite a 
while to avoid the recognition of her hostility 
against me and the realization of her dependent 
attachment to me, which she resented, by trying
to cut me out of her everyday life. She did so,
repeating an old pattern of living in two worlds, 
the world which she shared with me during our 
therapeutic interviews and her life outside the in- 
terviews, during which she excluded me completely 
from her thinking. Previously, the patient had 
established this pattern with her parents by living 
for eleven years in an imaginary kingdom which 
she populated by people of her own making and 
by the spiritual representations of others whom 
she actually knew. They all shared a language, 
literature, and religion of her own creation. 
Therapeutic investigation taught us that the pa- 
tient erected this private world as a means of 
excluding her prying parents from an integral 
part of her life. It was her way of fighting her 
dependence on them and of demonstrating how 
different she was from them in all areas where she 
disliked and resented them. 

The patient recognized the significance of the 
dichotomy in her dealings with me as a means of 
escape from her resentment against and depend- 
ence on me, only after going twice through a sud- 
den outburst of hostility and anxiety which led 
to brief periods of readmission to the hospital, 
where she regressed to her old symptom of burn- 
ing herself. 

After a few stormy therapeutic interviews, she 
understood the dynamic significance of her need 
for readmission; she felt so dependent on me and 
so hostile against me that she had to come back 
to live in the hospital and to burn her skin. 

During the ambulatory treatment periods which 
followed, the patient learned eventually to recog- 
nize that her excluding me from one part of her 
life was a repetition of the exclusion of her par- 
ents from her private kingdom. After that, she 
saw, too, that her resentment against me was also 
a revival of an old gripe against her parents; 
they had a marked tendency to make her out to 
be dumb, as I tried to do, in her judgment, by 
inflicting upon her my misevaluation of the skin 
burning. They kept her for many years in a state 
of overdependence, as I had done, too, by virtue 
of our therapeutic relationship. 

All these transference facets of the patient’s 
relationship with me, as well as the problems of the 
doctor-patient relationship in their own right, had 
to be worked through several times before the 
patient could ultimately become free from her 
interpersonal difficulties with me, with her parents, 
and with other people and from the anxiety which 
they engendered. 

While we consider the suggestions about psycho- 
therapy with schizophrenics, which we have of- 
fered, to be psychodynamically valid and helpful 
rules, we believe, on the other hand, that the ways 
and means to go about using them will inevitably 
be subject to many variations, depending on the 
specific assets and liabilities of the personality 
of the therapist and hence on the specific coloring 
of his interaction with his patient. 

Psychotherapy with schizophrenics is hard and 
exacting work for both patients and therapists. 
Every psychiatrist must find his own style in his 
psychotherapeutic approach to schizophrenic pa- 
tients. About technical details such as seeing 
patients only in the office, walking around with 
them, seeing them for non-scheduled interviews I 
used to have strong feelings and meanings. Now 
I consider them unimportant, as long as the psy- 
chotherapist is aware of and alert to the dynamic 
significance of what he and the patient are doing 
and what is going on between them. What mat- 
ters is that he conduct treatment on the basis of 
his correct appraisal and exploration of the psy- 
chodynamics of the patient’s psychopathology and 
its manifestations in the doctor-patient relation- 
ship. Successful histories of treatment with the 
principles suggested, but conducted in various and 
sundry interpersonal and environmental settings, 
are living proof of the validity of my present cor- 
rected attitude. 

Since the work with schizophrenics makes great 
and specific demands on the psychiatrist’s skill 
and endurance, no discussion of psychotherapy 
with schizophrenics is satisfactory as long as the 
consideration of the specific personal problems 
of the therapist is omitted. In view of the exten- 
sive previous discussions of this topic by others 
and by myself, I shall only briefly enumerate the 
specific problems and requirements which ought 
to be met and solved by psychiatrists who wish 
to work with schizophrenics: they should be able 
to realize and constructively handle unexpected 
emotional responses, such as fears or anxieties, at 
times inevitably aroused in each of them by anx- 
ious, violent, overdependent, or lonely schizo- 
phrenic patients. 

There is one special point I might add. Psy- 
chotherapists who share the fear of loneliness, 
which is the fate of men in our time, must watch 
out specifically lest their need to counteract their 
own loneliness make them incapable of enduring 
the inevitable loneliness and separation that their 
schizophrenic patients may bring home to them in 
their isolating cryptic communications. An un- 
desirable urge to translate cryptic schizophrenic 
communications prematurely may interfere in 
such therapists with the more sound tendency to 
wait patiently and listen to the patient's own ex- 
planations of their communications. 


SUMMARY 


1. The goal of dynamic psychotherapy with 
schizophrenics is the same as that of intensive 
psychotherapy with other mental disturbances, 
i.e., to help both ambulatory and hospitalized pa- 
tients gain awareness of and curative insight into 
the history and unknown dynamic causes respon- 
sible for their disorder. 

2. The same type of psychotherapeutic ap- 
proach to schizophrenic patients during all phases 
and manifestations of the disorder and discussions 
of illness and treatment after their recovery are 
recommended for the purpose of helping such pa- 
tients to integrate their recovery with their psy- 
chotic past. 

3. An attempt is made to understand schizo- 
phrenic symptomatology and to approach it thera- 
peutically as an expression of and a defense 
against anxiety. The hypothesis is offered that 
the universal human experience of tension be- 
tween dependency, fear of relinquishing it, recoil 
from it, and interpersonal hostility becomes, in 
the case of schizophrenic persons, so highly mag- 
nified and so overwhelming that it leads to un- 
bearable degrees of anxiety and then to discharge 
in symptom formation. 

4. The multiple meaning of many schizo- 
phrenic symptoms, communications, and other 
manifestations has been discussed. The need for 
understanding and translating them descriptively 
for therapeutic reasons has been questioned, and 
the significance of non-verbal communications 
with schizophrenic patients has been stressed. 

5. Psychodynamic investigation and clarifica- 
tion of schizophrenic anxiety and symptomatology 
in its conscious and unconscious manifestations 
in the patient-psychiatrist relationship is presented 
as being equally crucial for psychotherapy with 
schizophrenics as for other mental patients. 


APPLICATION OF OPERANT CONDITIONING 
TO REINSTATE VERBAL BEHAVIOR 
IN PSYCHOTICS * 1 


WAYNE ISAACS, JAMES THOMAS, 
AND ISRAEL GOLDIAMOND 2 


In operant conditioning, behavior is controlled 
by explicitly arranging the consequences of the 
response, the explicit consequence being termed 
reinforcement. For example, a lever-press by a 
rat activates a mechanism which releases food. 
If the rat has been deprived of food, lever-pressing 
responses will increase in frequency. If this rela- 
tionship between food and response holds only 
when a light is on, the organism may discriminate 
between light on and light off, that is, there will 
be no lever-pressing responses when the light is 
turned off, but turning it on will occasion such 
responses. From this simple case, extensions 
can be made to more complicated cases which 
may involve control of schedules of reinforce- 
ment. These procedures have recently been ex- 
tended to the study of psychopharmacology (5), 
controlled production of stomach ulcers (4), ob- 
taining psychophysical curves from pigeons (3), 
conditioning cooperative behavior in children (2), 
programming machines which teach academic 
subjects (11), analyzing the effects of noise on 
human behavior (1), and decreasing stuttering 
(7), to mention a few examples. 

The following account is a preliminary report 
of the use of operant conditioning to reinstate 
verbal behavior in two hospitalized mute psy- 
chotics. Patient A, classified as a catatonic 
schizophrenic, 40, became completely mute al- 
most immediately upon commitment 19 years 
ago. He was recorded as withdrawn and exhibit- 
ing little psychomotor activity. Patient B, classi- 
fied as schizophrenic, mixed type, with catatonic 
features predominating, was 43, and was com- 
mitted after a psychotic break in 1942, when he 
was combative. He completely stopped verbaliz- 
ing 14 years ago. Each S was handled by a dif- 
ferent E (experimenter). The E's were ignorant 
of each other’s activities until pressed to report 
their cases. This study covers the period prior 
to such report. 


* Reprinted by permission from the Journal of 
Speech and Hearing Disorders, February, 1960, Vol. 
25, No. 1, 8-12. 

1 This report stems from projects connected with 
a weekly seminar on operant conditioning conducted 
at the hospital by the third author. Responsibility 
for the authorship and the post hoc analysis is the 
third author’s; the first two authors are responsible 
for application of experimentally based procedures to 
shape the verbal behaviors of the patients. 

2 The authors wish to express their appreciation to 
Dr. Leonard Horecker, Clinical Director of Anna 
State Hospital, and to Dr. Robert C. Steck, Hospital 
Superintendent, for their encouragement and facilita- 
tion of the project. This investigation was supported 
in part by a grant from the Psychiatric Training and 
Research Fund of the Illinois Department of Public 
Welfare. 


CASE HISTORIES 


Patient A.—The S was brought to a group ther- 
apy session with other chronic schizophrenics 
(who were verbal), but he sat in the position in 
which he was placed and continued the with- 
drawal behaviors which characterized him. He 
remained impassive and stared ahead even when 
cigarettes, which other members accepted, were 
offered to him and were waved before his face. 
At one session, when E removed cigarettes from 
his pocket, a package of chewing gum accidentally 
fell out. The S’s eyes moved toward the gum 
and then returned to their usual position. This 
response was chosen by E as one with which he 
would start to work, using the method of succes- 
sive approximation (9). (This method finds use 
where E desires to produce responses which are 
not present in the current repertoire of the organ- 
ism and which are considerably removed from 
those which are available. The E then attempts 
to “shape” the available behaviors into the de- 
sired form, capitalizing upon both the variability 
and regularity of successive behaviors. The shap- 
ing process involves the reinforcement of those 
parts of a selected response which are succes- 
sively in the desired direction and the nonrein- 
forcement of those which are not. For example, 
a pigeon may be initially reinforced when it 
moves its head. When this movement occurs 
regularly, only an upward movement may be re- 
inforced, with downward movement not rein- 
forced. The pigeon may now stretch its neck, 
with this movement reinforced. Eventually the 
pigeon may be trained to peck at a disc which was 
initially high above its head and at which it would 
normally never peck. In the case of the psychotic 
under discussion, the succession was eye move- 
ment, which brought into play occasional facial 
movements, including those of the mouth, lip 
movements, vocalizations, word utterance, and 
finally, verbal behavior.) 
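
The shaping procedure described in the parenthesis can be illustrated with a toy simulation. The sketch below is not the authors' method. It assumes, purely for illustration, that a response can be scored as a single number, that the subject's responses vary randomly around a typical value, and that each reinforcement pulls the typical value toward the reinforced response. The names `shape`, `target`, `spread`, `learning_rate`, and `stepwise` are hypothetical.

```python
import random


def shape(target=10.0, start=0.0, spread=1.0, learning_rate=0.5,
          max_trials=5000, stepwise=True, seed=0):
    """Toy model of shaping by successive approximation (illustrative only)."""
    rng = random.Random(seed)
    typical = start                       # response the subject tends to emit now
    # With stepwise=False the experimenter waits for the full target response.
    criterion = start if stepwise else target
    reinforced = 0
    for trial in range(1, max_trials + 1):
        # Variability of behavior: responses scatter around the typical value.
        response = rng.gauss(typical, spread)
        if response >= criterion:
            # Reinforcement: the typical response drifts toward what was reinforced.
            typical += learning_rate * (response - typical)
            reinforced += 1
            if stepwise:
                # Raise the requirement to the new typical level, capped at target.
                criterion = min(target, max(criterion, typical))
        # Responses below the criterion are simply not reinforced.
        if typical >= target:
            return {"trials": trial, "reinforced": reinforced, "typical": typical}
    return {"trials": max_trials, "reinforced": reinforced, "typical": typical}


if __name__ == "__main__":
    print("stepwise:", shape())
    print("waiting for the target response:", shape(stepwise=False))
```

Under these assumptions, `shape()` reaches the target after a modest number of reinforcements, whereas `shape(stepwise=False)`, which withholds reinforcement until the full target response appears, in practice never reinforces anything, because the target lies far outside the initial repertoire.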

The S met individually with E three times a 
week. Group sessions also continued. The fol- 
lowing sequence of procedures was introduced in 
the private sessions. Although the weeks are 
numbered consecutively, they did not follow at 
regular intervals since other duties kept E from 
seeing S every week. 

Weeks 1, 2. A stick of gum was held before 
S’s face, and E waited until S’s eyes moved toward 
it. When this response occurred, E as a conse- 
quence gave him the gum. By the end of the sec- 
ond week, response probability in the presence of 
the gum was increased to such an extent that S's 
eyes moved toward the gum as soon as it was held 
up. 

Weeks 3, 4. The E now held the gum before 
S, waiting until he noticed movement in S’s lips 
before giving it to him. Toward the end of the 
first session of the third week, a lip movement 
spontaneously occurred, which E promptly rein- 
forced. By the end of this week, both lip move- 
ment and eye movement occurred when the gum 
was held up. The E then withheld giving S the 
gum until S spontaneously made a vocalization, 
at which time E gave S the gum. By the end of 
this week, holding up the gum readily occasioned 
eye movement toward it, lip movement, and a 
vocalization resembling a croak. 

Weeks 5, 6. The E held up the gum, and said, 
“Say gum, gum,” repeating these words each time 
S vocalized. Giving S the gum was made con- 
tingent upon vocalizations increasingly approxi- 
mating gum. At the sixth session (at the end of 
Week 6), when E said, “Say gum, gum,” S sud- 
denly said, “Gum, please.” This response was 
accompanied by reinstatement of other responses 
of this class, that is, S answered questions re- 
garding his name and age. 

Thereafter, he responded to questions by E 
both in individual sessions and in group sessions, 
but answered no one else. Responses to the dis- 
criminative stimuli of the room generalized to E 
on the ward; he greeted E on two occasions in the 
group room. He read from signs in E’s office 
upon request by E. 

Since the response now seemed to be under 
the strong stimulus control of E, the person, an at- 
tempt was made to generalize the stimulus to 
other people. Accordingly, a nurse was brought 
into the private room; S smiled at her. After a 
month, he began answering her questions. Later, 
when he brought his coat to a volunteer worker 
on the ward, she interpreted the gesture as a 
desire to go outdoors and conducted him there. 
Upon informing E of the incident, she was in- 
structed to obey S only as a consequence of 
explicit verbal requests by him. The S thereafter 
vocalized requests. These instructions have now 
been given to other hospital personnel, and S 
regularly initiates verbal requests when nonverbal 
requests have no reinforcing consequences. Upon 
being taken to the commissary, he said, “Ping 
pong,” to the volunteer worker and played a game 
with her. Other patients, visitors, and members 
of hospital-society-at-large continue, however, to 
interpret nonverbal requests and to reinforce them 
by obeying S. 

Patient B.—This patient, with a combative his- 
tory prior to mutism, habitually lay on a bench 
in the day room in the same position, rising only 
for meals and for bed. Weekly visits were begun 
by E and an attendant. During these visits, E 
urged S to attend group therapy sessions which 
were being held elsewhere in the hospital. The 
E offered S chewing gum. This was not accepted 
during the first two visits, but was accepted on 
the third visit and thereafter. On the sixth visit, 
E made receipt of the gum contingent upon S’s 
going to the group room and so informed S. The 
S then altered his posture to look at E and accom- 
panied him to the group room, where he seated 
himself in a chair and was given the gum. There- 
after, he came to this room when the attendants 
called for him. 

Group Sessions 1-4. Gum reinforcement was 
provided for coming to the first two weekly ses- 
sions, but starting with the third, it was made 
contingent upon S’s participation in the announced 
group activity. The group (whose other members 
were verbal) was arranged in a semicircle. The 
E announced that each S would, when his turn 
came, give the name of an animal. The E imme- 
diately provided gum to each S who did so. The 
S did not respond and skipped his turn three times 
around. The same response occurred during the 
fourth session. 

Group Session 5. The activity announced was 
drawing a person; E provided paper and colored 
chalk and visited each S in turn to examine the 
paper. The S had drawn a stick figure and was 
reinforced with gum. Two of the other patients, 
spontaneously and without prior prompting by E, 
asked to see the drawing and complimented S. 
Attendants reported that on the following day, S, 
when introduced to two ward visitors, smiled and 
said, “I’m glad to see you.” The incident was 
followed by no particular explicit consequences. 

Group Session 6. The announced activity was 
to give the name of a city or town in Illinois. The 
S, in his turn, said, “Chicago.” He was rein- 
forced by E, who gave him chewing gum, and 
again two members of the group congratulated 
him for responding. Thereafter, he responded 
whenever his turn came. 

After the tenth session in the group, gum rein- 
forcement was discontinued. The S has continued 
to respond vocally in the situations in which he 
was reinforced by E but not in others. He never 
initiates conversations, but he will answer various 
direct questions in the group sessions. He will 
not, however, respond vocally to questions asked 
on the ward, even when put by E. 


DISCUSSION 


Both S’s came from special therapy wards of 
patients selected because of depressed verbal be- 
havior and long stay in the hospital; tranquilizing 
drugs were not used. The extent to which rein- 
statement of verbal behavior was related to the 
special treatment offered the patients in the special 
wards set up for them cannot readily be assayed. 
Among the special treatments accorded them were 
group therapy sessions. Nevertheless, the simi- 
larities between the pattern of reacquisition of 
verbal behavior by the patients and the patterns 
of learning encountered in laboratory studies sug- 
gest that the conditioning procedures themselves 
were involved in the reinstatement of verbal be- 
havior. 

In the case of Patient A, the speaking response 
itself was gradually shaped. The anatomical re- 
lation between the muscles of chewing and speak- 
ing probably had some part in E’s effectiveness. 
When a word was finally produced, the response 
was reinstated along with other response members 
of its class, which had not been reinforced. The 
economy of this process is apparent, since it elimi- 
nates the necessity of getting S to produce every 
desired response in order to increase his reper- 
toire. In this case, E concentrated on one verbal 
response, and in reinstating it, reinstated verbal 
responses in general. On the stimulus side, when 
the response came under the stimulus control of 
E, the stimulus could be generalized to other 
members of E’s class of discriminative stimuli, 
namely, people. This may have relevance for the 
clinical inference of the importance for future 
interpersonal relations of prior identification with 
some person. In the case of Patient B, the stimu- 
lus control involved a given setting, the rooms 
where he had been reinforced. The discrimina- 
tion of E in one case, and not in the other, may 
be explained in terms of the establishment of 
operant discrimination, which also involves ex- 
tinction (9). Operant discrimination is estab- 
lished when a response in the presence of S^D, a 
discriminative stimulus, is reinforced, and a re- 
sponse in the presence of S^Δ, a stimulus other than 
S^D, is not. After some time, the response will 
occur when S^D is presented, but not when S^Δ is 
presented; the response discriminates S^D from S^Δ, 
it having been extinguished when S^Δ was pre- 
sented. In the case of Patient A, E was with S 
on the ward, in the group room, and privately. 
Reinforcement occurred on all occasions. But S 
was on the ward (and other rooms) without E, 
and therefore without reinforcement for those re- 
sponses which were occasioned by the ward and 
which only E reinforced. Hence, these responses 
would extinguish in the ward alone, but would 
continue in the presence of E, defining discrimina- 
tion of E from other stimuli. In the case of 
Patient B, this process may have been delayed by 
the fact that E and the other patients reinforced 
only in a specific room. It will be recalled that 
attendants rather than E brought S to the group 
room. 
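
The account of operant discrimination above can be sketched numerically. What follows is a minimal illustration under stated assumptions, not a model proposed by the authors: the probability of responding in each stimulus condition is tracked as a number between 0 and 1, a reinforced response raises it by a fixed fraction of the remaining distance to 1, and an unreinforced response lowers it by a fixed fraction. The function name, the `gain` and `loss` parameters, and the starting probability are all hypothetical.

```python
def train_discrimination(trials=200, gain=0.1, loss=0.1, p_start=0.5):
    """Expected course of operant discrimination in a toy linear model.

    p_sd     : probability of responding in the presence of S^D (reinforced)
    p_sdelta : probability of responding in the presence of S^delta (not reinforced)
    All parameter values are illustrative assumptions.
    """
    p_sd = p_start
    p_sdelta = p_start
    history = []
    for _ in range(trials):
        # In S^D a response occurs with probability p_sd and is reinforced,
        # so the expected change per trial is p_sd * gain * (1 - p_sd).
        p_sd += p_sd * gain * (1.0 - p_sd)
        # In S^delta a response occurs with probability p_sdelta and goes
        # unreinforced (extinction), so the expected change per trial is
        # -p_sdelta * loss * p_sdelta.
        p_sdelta -= p_sdelta * loss * p_sdelta
        history.append((p_sd, p_sdelta))
    return history


if __name__ == "__main__":
    final_sd, final_sdelta = train_discrimination()[-1]
    print(f"S^D: {final_sd:.3f}   S^delta: {final_sdelta:.3f}")
```

In this sketch the two probabilities start out equal and then diverge: responding in the presence of S^D approaches certainty while responding in the presence of S^Δ declines toward zero. That divergence is what the text means by the response discriminating one stimulus from the other.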

Interestingly, in the group sessions, when Pa- 
tient B emitted the responses which E reinforced, 
other psychotic patients also reinforced Patient B. 
They were thereby responding, on the occasion of 
S’s responses (discriminative stimuli for them), in 
the same way that E did. The term identification, 
used as a label here, shares some behavioral 
referents with the term as used in the preceding 
paragraph and might be explained behaviorally 
in terms of the generalized reinforcer (10). These 
behaviors by the patients are similar to behaviors 
reported in client-centered group sessions, where 
clients increase in reflective behaviors as counsel- 
ing progresses, and in psychoanalytic group ses- 
sions, where patients increasingly make analytic 
interpretations of each other. Here, the patients 
are also behaving like the therapist. While this 
parallel lends itself to the facetious thought that 
operant group sessions may produce operant con- 
ditioners, it does suggest that psychotics are be- 
having, with regard to responses by the major 
source of reinforcement in the group, according 
to the same laws which govern such group be- 
haviors of non-hospitalized S’s. 

The various diagnostic labels applied to psy- 
chotics are based to a considerable extent upon 
differences between responses considered abnor- 
mal, for example, hallucinations, delusions of per- 
secution, and the like. The therapeutic process 
is accordingly at times seen in terms of eliminat- 
ing the abnormal behaviors or states. Experimen- 
tal laboratory work indicates that it is often ex- 
tremely difficult to eliminate behavior; extinction 
is extremely difficult where the schedule of rein- 
forcement has been a variable interval schedule 
(6), that is, reinforcement has been irregular, as 
it is in most of our behaviors. Such behaviors 
persist for considerable periods without reinforce- 
ment. Experimental laboratory work has pro- 
vided us quite readily with procedures to increase 
responses. In the case of psychotics, this would 
suggest focusing attention on whatever normal 
behaviors S has; an appropriate operant, no mat- 
ter how small or insignificant, even if it is con- 
fined to an eye movement, may possibly be raised 
to greater probability, and shaped to normal be- 
havior (8). Stated otherwise, abnormal behaviors 
and normal behaviors can be viewed as recipro- 
cally related, and psychotics as exhibiting consid- 
erable abnormal behavior, or little normal behav- 
ior. Normal behavior probability can be in- 
creased by decreasing probability of abnormal 
behaviors, or abnormal behaviors can be de- 
creased by the controlled increase of normal 
behaviors. This preliminary report suggests that 
a plan of attack based upon the latter approach 
may be worth further investigation. 


SUMMARY 


Verbal behavior was reinstated in two psy- 
chotics, classified as schizophrenics, who had been 
mute for 19 and 14 years. The procedures uti- 
lized involved application of operant conditioning. 
The relationship of such procedures, based on 
controlled laboratory investigations with men and 
animals, to procedures based on clinical practice 
with human patients was discussed and was con- 
sidered as directing our attention to shaping and 
increasing the probability of what normal behav- 
iors the psychotic possesses. 